MotherDuck

Data Infrastructure and Analytics

A data warehouse that scales in the cloud to make big data feel small, using the speed, efficiency, and ergonomics of DuckDB.

About us

Making analytics fun, frictionless and ducking awesome with a cloud data warehouse based on DuckDB's efficiency, ergonomics and performance in collaboration with the folks at DuckDB Labs.

Website: https://motherduck.com
Industry: Data Infrastructure and Analytics
Company size: 11-50 employees
Headquarters: Seattle
Type: Privately Held
Founded: 2022

Updates

  • If you think setting up an OLAP cache requires a massive infrastructure overhaul, think again. Simon Späti breaks down exactly how to leverage DuckDB community extensions to dramatically accelerate your dashboards with almost zero setup cost. A perfect example of how lightweight tools can solve heavy data problems.

    OLAP isn't dead, right? 😉 Not to me; it has never been more alive than now. But what if OLAP itself isn't fast enough, or you use DuckDB and want to speed up caching? Do you need a full-fledged OLAP system and to re-ingest all your analytics data? Maybe not. I went through OLAP caches for DuckDB in my recent deep dive and uncovered ways of simply adding two lines of code with community extensions:

    ```
    SET GLOBAL cache_path = '/tmp/my_duckdb_cache.bin';
    SET GLOBAL cache_enabled = true;
    ```

    And you are good to go: an instant speed-up with almost no setup cost. Have you tried it already? In this piece, I also elaborate on «The History of Caching BI Workloads» and the «Different Levels and Kinds of Caches» we have and are using today. Furthermore, I explore the typical obstacles when building a cache yourself. If that interests you, please read the attached essay with practical examples you can try out immediately. I hope you enjoy it.

    • Simplicity of a Database, but the Speed of a Cache: OLAP Caches for DuckDB
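    A minimal end-to-end sketch of the setup quoted above. The post does not name which community extension exposes these settings, so the extension name below (cache_httpfs) and the queried file are assumptions; only the two SET lines are taken from the post:

    ```
    -- Assumed community extension name; the two SET lines are quoted from the post above.
    INSTALL cache_httpfs FROM community;
    LOAD cache_httpfs;

    -- Point the cache at a local file and switch it on.
    SET GLOBAL cache_path = '/tmp/my_duckdb_cache.bin';
    SET GLOBAL cache_enabled = true;

    -- Repeated reads of the same remote data should now be served from the local cache.
    SELECT count(*) FROM read_parquet('https://example.com/events.parquet');
    ```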

  • The hardest data problems aren't always "big data" problems. They are complex data problems. Lucie T., Data Engineer at Opto Investments, shared her thoughts on why private markets need a different kind of Modern Data Stack, highlighting four specific challenges that standard tools often miss:
    🌀 Schema Chaos: No CUSIP-like standardization means every fund structure is different.
    ⏳ Temporal Complexity: You need to track when you knew something, not just what changed.
    📖 Context: A number (like IRR) means nothing without the story behind it.
    ⚖️ Regulatory Speed: The need to move fast without breaking fiduciary duty.
    She also details why MotherDuck has become their new standard for solving this. By moving away from massive clusters and Spark jobs, they get the power of a cloud warehouse with the simplicity of a local database. This allows for much faster iteration. Really appreciate the mention! Check out the full post here:
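    To make the "when you knew it vs. when it was true" point concrete, here is a minimal bitemporal sketch in DuckDB-style SQL. The schema and values are hypothetical, not Opto's actual model:

    ```
    -- Hypothetical bitemporal table: one interval for when the value applied
    -- in the real world, one for when we believed it.
    CREATE TABLE fund_metrics (
        fund_id        TEXT,
        irr            DOUBLE,
        valid_from     DATE,       -- when the value applies
        valid_to       DATE,
        recorded_from  TIMESTAMP,  -- when we started believing it
        recorded_to    TIMESTAMP   -- when it was superseded
    );

    -- "What did we think this fund's Q2 IRR was, as of the end of March?"
    SELECT irr
    FROM fund_metrics
    WHERE fund_id = 'FUND-12'
      AND DATE '2024-06-30' BETWEEN valid_from AND valid_to
      AND TIMESTAMP '2024-03-31 00:00:00' BETWEEN recorded_from AND recorded_to;
    ```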

  • Your boss calls you in. "Simple question," they say. "Is our business in good shape?" And you freeze. Because you know the data. You know that "good" depends on so many variables. It depends on what you mean by good. It depends on what kind of business you're referring to. What's an active user? We have conservative definitions and aggressive ones. Blah, blah, blah. So you give a 5-minute explanation full of footnotes. And your boss looks at you like you are crazy. This was the uncomfortable truth Benn Stancil laid out at Small Data SF. He pointed out that the modern data stack is built on a lot of faith. Companies hire expensive teams to look at esoteric numbers. And how do we know if any of it works? We hire more of them. They tell us. That faith is fragile. And it might be fracturing. Because if your stakeholder has to choose between a nuanced answer that takes weeks... or a directional "vibe" they get right now... they will take the vibe. They will choose the system that reads the support tickets and gives them a pulse check. Over the analyst who writes a giant paragraph explaining why the revenue numbers are technically complicated. Benn's not saying this is how it should be. He's saying this is what's happening. And the generation coming up behind us? They might not share our faith in quantification at all. Watch Benn's full talk from this year's Small Data SF here:

  • For years, doing geospatial analysis meant slow desktop software and complex infrastructure. We are excited to see Qiusheng Wu's new book, Spatial Data Management with DuckDB, clarifying the new blueprint for spatial analytics. It is not just a technical guide. It is a detailed look at how the industry is moving away from the "download and process" cycle toward high-performance, in-process SQL querying. The book explores how modern stacks can leverage out-of-core processing to handle massive datasets like global building footprints or national wetlands inventories without the traditional memory bottlenecks. This is a fantastic resource for anyone looking to modernize their geospatial workflows. Huge congratulations to Qiusheng on the launch! It is a massive contribution to the open-source and geospatial communities. Check out the book here: https://duckdb.gishub.org/
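    As a flavor of the in-process, out-of-core style the book covers, here is a minimal sketch using DuckDB's spatial extension. The Parquet path and column names are placeholders, not examples from the book:

    ```
    INSTALL spatial;
    LOAD spatial;

    -- Scan a large set of building-footprint Parquet files without loading them
    -- fully into memory, aggregating footprint area per region.
    SELECT region,
           count(*)                               AS buildings,
           sum(ST_Area(ST_GeomFromWKB(geometry))) AS total_area
    FROM read_parquet('buildings/*.parquet')
    GROUP BY region
    ORDER BY total_area DESC;
    ```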

  • Tomorrow (Wednesday PST), Jacob Matson and Cody Peterson will host a Hands-On Lab on Agentic Data Engineering! Join us (RSVP using the link in the comments) for a 45-minute hands-on lab where you'll combine MotherDuck and Ascend to build a complete data pipeline while leveraging AI agents that handle operational work for you. You'll experience first-hand how high-performing data teams are using these technologies to deliver trusted data faster and more efficiently. Together we'll:
    - build end-to-end data pipelines on MotherDuck
    - deploy agents to automate pipeline operations and workflows
    - implement best practices for deploying pipelines and agentic workflows in production
    By the end, you'll have a working pipeline and hands-on experience with both MotherDuck and Ascend.

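    If you want to warm up before the lab, here is a minimal sketch of connecting a local DuckDB session to MotherDuck. This is not the lab's actual material; the database, table, and file names are placeholders:

    ```
    -- The token can also be supplied via the MOTHERDUCK_TOKEN environment variable.
    SET motherduck_token = 'YOUR_TOKEN';

    -- Attach a MotherDuck-hosted database and land some local data in it.
    ATTACH 'md:my_db';
    CREATE TABLE my_db.raw_events AS
    SELECT * FROM read_csv_auto('events.csv');
    ```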
    "We have great datasets." The data: 47 variations of "St. Albans" in one column. This post from Adam Sroka hit r/dataengineering hard because every DE has lived it (https://lnkd.in/grJH7ak9). And the frustration goes deeper than messy string, it's that nobody wants to invest in fixing it. Everyone wants clean data, but no one wants to pay for it. We asked four senior data engineers how they deal with it. The consensus? Stop trying to fix everything at once. 📋 Use the WAP technique (Write, Audit, Publish): Mehdi Ouazza advises writing data to a staging area and auditing it before publishing. Better to have no data than bad data. 🔄 Start small and iterate: Julien Hurault suggests shipping pipelines with basic tests and adding more as they fail. Avoid over-engineering from day one. 💬 Learn through exposure: Simon Späti notes that you have to see "really bad" data to understand what data quality even means. Talk to business experts to learn which data actually matters. 🎯Focus on key datasets: Benjamin Rogojan warns against alert fatigue - "don't boil the ocean." If you alert on everything, people ignore everything. Fix the critical data first. Data quality is just one of 10 top-upvoted r/dataengineering questions we tackled with our expert panel. Read the full blog here: https://lnkd.in/gBJg-wTX

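    A minimal sketch of the Write-Audit-Publish pattern in plain SQL. Table, column, and file names are placeholders, and the production table is assumed to already exist:

    ```
    -- 1) WRITE: land the new batch in a staging table, never directly in production.
    CREATE OR REPLACE TABLE orders_staging AS
    SELECT * FROM read_parquet('new_batch.parquet');

    -- 2) AUDIT: check basic invariants; better no data than bad data.
    SELECT count(*) AS bad_rows
    FROM orders_staging
    WHERE order_id IS NULL OR amount < 0;

    -- 3) PUBLISH: only if bad_rows = 0, swap the audited batch into production atomically.
    BEGIN TRANSACTION;
    DELETE FROM orders WHERE batch_date = current_date;
    INSERT INTO orders SELECT * FROM orders_staging;
    COMMIT;
    ```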

  • Your pipeline just corrupted customer records. You need to roll back. But you can't just revert one table. Sales depends on customers, which depends on products, which changed two hours ago. This is what "Git for Data" promises to solve. But how do we efficiently fork data without waiting forever or spending a fortune? In part 1 of our deep dive with Simon Späti, we laid out a spectrum of data movement efficiency, ordered from least to most efficient:
    1️⃣ Full 1:1 copying - Simple to understand but slow and expensive, especially at scale.
    2️⃣ Delta-based changes - Only store what changed. Revert by pointing to the previous state.
    3️⃣ Zero-copy virtualization - Share data between systems without serialization overhead.
    4️⃣ Metadata/catalog-based versioning - Create logical versions with just pointer changes. No data duplication.
    The key insight behind #4: when you change one row in a million-row table, you only need to update that chunk. Everything else is shared between versions. Diffing scales with what changed, not total dataset size. Read the full blog:
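    To illustrate approach #4, here is a toy sketch of catalog-based versioning in SQL. The tables, chunk IDs, and values are hypothetical; real systems track far more metadata:

    ```
    -- Immutable data files, and a manifest that records which chunks each version sees.
    CREATE TABLE chunks   (chunk_id INTEGER, file_path TEXT);
    CREATE TABLE manifest (version INTEGER, chunk_id INTEGER);

    -- Version 2 reuses every chunk of version 1 except chunk 7,
    -- which was rewritten as chunk 7001; no data is copied.
    INSERT INTO manifest
    SELECT 2, chunk_id FROM manifest WHERE version = 1 AND chunk_id <> 7;
    INSERT INTO manifest VALUES (2, 7001);

    -- Diffing scales with what changed: compare pointers, not data.
    SELECT chunk_id FROM manifest WHERE version = 2
    EXCEPT
    SELECT chunk_id FROM manifest WHERE version = 1;
    ```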

  • We’re incredibly grateful for our flock of users, partners, and the whole data community. Whether you're feasting or coding today, we appreciate you being on this journey with us. Happy Thanksgiving from the MotherDuck team! 💛


Funding

MotherDuck: 3 total rounds
Last round: Series B (US$ 52.5M)

See more info on Crunchbase