I'm super interested in this project! I'd like to see how interop really works and how to get multiple data sources interconnected to produce amazing datasets.
It would be interesting to hear about your experience. Did you do a write-up somewhere, or could I email you with a few questions?
My project would be to put different data sources together + to allow users to upload their own structured data via Excel (think financial estimates). The current system has about 450 users; the next might have many more, depending on whether it gets extended to other divisions.
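For the upload part, a minimal sketch of validating user-supplied estimate rows once they've been parsed out of an Excel sheet (the parsing itself, e.g. via openpyxl, is omitted; the column names "quarter" and "revenue_estimate" are hypothetical):

```python
# Validate rows parsed from an uploaded Excel sheet of financial
# estimates. Column names are assumptions for illustration.

REQUIRED = {"quarter", "revenue_estimate"}

def validate_rows(rows):
    """Return (clean_rows, errors) for user-uploaded estimate data."""
    clean, errors = [], []
    for i, row in enumerate(rows, start=2):  # row 1 is the sheet header
        missing = REQUIRED - row.keys()
        if missing:
            errors.append(f"row {i}: missing {sorted(missing)}")
            continue
        try:
            value = float(row["revenue_estimate"])
        except (TypeError, ValueError):
            errors.append(f"row {i}: revenue_estimate is not numeric")
            continue
        clean.append({"quarter": str(row["quarter"]),
                      "revenue_estimate": value})
    return clean, errors
```

Collecting errors with sheet row numbers (rather than failing on the first bad cell) makes it easy to show non-technical users exactly which lines of their spreadsheet need fixing.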
I'm glad someone is doing this. It's a very difficult problem to solve, but I see it as similar to JSX in that the business logic and the underlying infra are often tightly coupled. There are certain kinds of data integration problems where I could see this being intuitive and useful.
I like the Datadex idea and the promise to make datasets as easily accessible as source code repositories. I can see a number of use cases already for a standardized and fast way to collaborate on structured data sets.
I agree. Everything should be collaborative. Feel free to reach out if you're interested in collaborating (email in my HN profile). I'm trying to get a v0 of this database out now, and building desirable use-cases is super important.
I think that's a very good idea for a new startup. In fact, I have a client who asked me to build something like this for him. He has tons of stuff running in his business (a mining company). Data is everywhere (Excel sheets, servers, databases), and he wants me to integrate everything and extract a lot of information. I have no idea how to do such a thing.
You create sources of data and can have non-technical (or maybe non-developer) roles wire these data pipelines together with transformations, aggregations, etc.
Thanks, that's a useful project! And yes, we aim to make data processing easy (through a UI, low code) and easy to reuse/export (by converting UI steps 1:1 to code).
Thanks! We're trying to build a more complete data solution. The pre-built block types are the first step, which let you set up basic metrics. We want to build connectors for custom APIs, databases, and even data warehouses, and then let teams invite data engineers to set up metrics for the non-technical users on their team.
Do you anticipate going more towards improving the data modelling capabilities (taking on dbt et al.) or more towards business intelligence (dashboards, then hosting, then a drag-and-drop query builder, all the way to the dreaded PDF export)?
Something that is overlooked in the dbt direction is how complex data models get.
In the BI direction, nothing seems overlooked; it is just hard!
I like that you have a clear anti-ICP [dbt customers, analysts]. This keeps you clear of the BI/DWH space. I do wonder how you avoid getting stuck in the BI tar pit [], or avoid getting stuck in the dbt middleware zone. Maybe with a core focus on engineers getting further and further without needing a BI/data team!
Really interesting use case. I'm super impressed with the concept of automated data generation and gathering, and of having a common platform for the company. Perhaps the most impressive idea is that the generated data appears to be programmatically accessible. I haven't worked in a large enterprise that has advanced past unsearchable, poorly indexed Excel files, manually manipulated and updated, floating around on shared drives.
This Wolfram implementation seems great, but I wonder about a lot of the annoying "hard stuff" that would come up in a larger environment: handling of permissions, different user roles, etc.
Hey, I love the work you're doing here. I've been wanting to create a dev-oriented data table service, like Airtable but more dev-friendly, with a strong plugin system. Want to have a quick chat?
It looks quite cool. I think it would be really useful if you had a lot of integrations into different programming languages, frameworks, and maybe even SQL servers.
So you could do data validation on the frontend, the backend, and the database server based on the same definitions.
It would save us a lot of bugs caused by different opinions of what counts as valid data in different layers of the software.
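The "define once, validate everywhere" idea could be sketched as a single schema that drives both runtime checks and a generated SQL CHECK constraint (the field names and the tiny rule vocabulary here are hypothetical):

```python
# One schema definition drives validation in application code and a
# generated SQL CHECK clause, so every layer agrees on what "valid"
# means. Fields and rules are illustrative assumptions.

SCHEMA = {
    "age": {"type": int, "min": 0, "max": 150},
    "email": {"type": str, "min_len": 3},
}

def validate(record):
    """Check a record against SCHEMA; return a list of error strings."""
    errors = []
    for field, rule in SCHEMA.items():
        value = record.get(field)
        if not isinstance(value, rule["type"]):
            errors.append(f"{field}: wrong type")
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{field}: below minimum")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{field}: above maximum")
        if "min_len" in rule and len(value) < rule["min_len"]:
            errors.append(f"{field}: too short")
    return errors

def to_sql_check():
    """Emit the numeric rules from SCHEMA as a SQL CHECK constraint."""
    parts = []
    for field, rule in SCHEMA.items():
        if "min" in rule:
            parts.append(f"{field} >= {rule['min']}")
        if "max" in rule:
            parts.append(f"{field} <= {rule['max']}")
    return "CHECK (" + " AND ".join(parts) + ")"
```

A frontend would consume the same schema serialized as JSON; the point is that no layer hand-writes its own copy of the rules.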
I can imagine this being really useful from the ground up. Because it looks like it wants to be the source of truth, with different views on the data.
It’s hard to imagine it for a complex legacy application without lots of added complexity. It wants to be the unifying programming model for the application, which would seem like running two RDBMS sources of truth simultaneously.
It’s like the xkcd: “there are 12 ways of doing X, let’s create a standard to unify them”; now there are 13 ways.