photog.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A place for your photos and banter. "Photog first" is our motto. Please refer to the site rules before posting.

#DataEngineering

Sarah Lea<p>Why normalize databases?<br>Yesterday, my tutoring student asked me why databases need to be normalized at all. She said: “Wouldn’t it be easier to just have one big table with all the information?”</p><p>It’s a common first question when learning about relational databases.<br>At first, one big table (e.g. customer name, order date, product name, price) seems easiest.</p><p>I told her:<br>:blobcoffee: Because that quickly leads to data redundancy, anomalies, and integrity issues when inserting, updating, or deleting records.<br>:blobcoffee: Normalization means structuring data into separate, related tables, so that each fact is stored only once. This reduces redundancy &amp; preserves consistency.</p><p><a href="https://techhub.social/tags/databases" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>databases</span></a> <a href="https://techhub.social/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a> <a href="https://techhub.social/tags/datascience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>datascience</span></a> <a href="https://techhub.social/tags/datascientist" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>datascientist</span></a> <a href="https://techhub.social/tags/dataanalysis" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataanalysis</span></a> <a href="https://techhub.social/tags/dataanalyst" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataanalyst</span></a> <a href="https://techhub.social/tags/sql" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>sql</span></a> <a href="https://techhub.social/tags/data" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>data</span></a></p>
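A minimal sketch of the point made in the post above, using Python's built-in `sqlite3` module. The table names, columns, and sample rows are hypothetical and chosen only to mirror the post's example (customer name, order date, product name, price). Because each customer name is stored once, a single `UPDATE` corrects it for every order.

```python
import sqlite3

# Hypothetical schema and sample data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    product_id  INTEGER NOT NULL REFERENCES products(id),
    order_date  TEXT NOT NULL
);
INSERT INTO customers VALUES (1, 'Ada');
INSERT INTO products  VALUES (1, 'Notebook', 4.50);
INSERT INTO orders    VALUES (1, 1, 1, '2025-01-10'), (2, 1, 1, '2025-02-03');
""")

# The customer's name lives in exactly one row, so one UPDATE fixes it everywhere.
conn.execute("UPDATE customers SET name = 'Ada L.' WHERE id = 1")

# Joining the related tables rebuilds the "one big table" view on demand.
rows = conn.execute("""
    SELECT c.name, o.order_date, p.name, p.price
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    JOIN products  AS p ON p.id = o.product_id
    ORDER BY o.id
""").fetchall()
print(rows)
```

In a single wide table, the same rename would have to touch every order row, and missing one would leave the data inconsistent.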
HackerNoon<p>Discover how CocoIndex transforms data orchestration with a pure Data Flow Programming model — ensuring traceable, immutable, and declarative pipelines for know <a href="https://hackernoon.com/redefining-data-operations-with-data-flow-programming-in-cocoindex-u486ao8" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">hackernoon.com/redefining-data</span><span class="invisible">-operations-with-data-flow-programming-in-cocoindex-u486ao8</span></a> <a href="https://mas.to/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a></p>
Will Hopkins 🌈📸<p><a href="https://a2mi.social/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a> If you needed to use a data lake with Redshift, would you use Iceberg, given some native support, over Delta Lake, which is arguably a better format?</p><p>Asking for a friend who is me</p>
James Bartlett :terminal:<p>🧙‍♂️ One does not simply build reports on OLTP data…</p><p>This week on The Drill Down with Ahmad &amp; James, our special guest <br>Kristyna Ferris will be presenting a session titled "The Fellowship of the Star Schema: Transforming OLTP Data for Power BI" </p><p>🛠️ This session is packed with:<br>- Clear distinctions between OLTP &amp; OLAP<br>- Tips for building Power BI-ready models<br>- A sprinkle of Slowly Changing Dimension magic</p><p>💡Whether you’re a data wizard 🧙, business hobbit 🧝‍♀️, or SQL ranger 🏹 — this is your quest.</p><p>🗓️ Join us LIVE on LinkedIn | Wednesday, July 2nd @ 2PM Central<br><a href="https://lnkd.in/eWh4SsBb" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">lnkd.in/eWh4SsBb</span><span class="invisible"></span></a></p><p><a href="https://techhub.social/tags/TheDrillDown" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TheDrillDown</span></a> <a href="https://techhub.social/tags/MicrosoftFabric" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MicrosoftFabric</span></a> <a href="https://techhub.social/tags/PowerBI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PowerBI</span></a> <a href="https://techhub.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://techhub.social/tags/DataTransformation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTransformation</span></a> <a href="https://techhub.social/tags/DataAnalytics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataAnalytics</span></a> <a href="https://techhub.social/tags/BusinessIntelligence" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BusinessIntelligence</span></a> <a href="https://techhub.social/tags/StarSchema" class="mention hashtag" rel="nofollow 
noopener" target="_blank">#<span>StarSchema</span></a> <a href="https://techhub.social/tags/OLTP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OLTP</span></a> <a href="https://techhub.social/tags/KristynaFerris" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>KristynaFerris</span></a></p>
⚯ Michel de Cryptadamus ⚯<p>pro tip for user interface designers:</p><p>if you have hundreds of millions of dollars of venture capital and you want to make a user facing data analytics tool of some kind and you think it's reasonable to ask an average human being to type this:</p><p> CAST('2023-05-01' AS TIMESTAMP)</p><p>to do literally anything with a date or time in your application's user interface, just stop right there. do not pass go, do not collect $200, and do not ever attempt to offer feedback to a UX designer ever again. something is deeply broken inside you that means there are certain mysteries of the universe that even the guys who designed the postgres command line can access that you will never know, and that's ok. You can still live a really rad life.</p><p><a href="https://universeodon.com/tags/SQL" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SQL</span></a> <a href="https://universeodon.com/tags/dba" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dba</span></a> <a href="https://universeodon.com/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a> <a href="https://universeodon.com/tags/postgres" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>postgres</span></a></p>
⚯ Michel de Cryptadamus ⚯scariest shit i've seen in years
dealingwith<p>If anyone knows Data Engineers looking for work, this is our next hire: <a href="https://www.linkedin.com/posts/dealingwith_dataengineering-hiring-startuplife-activity-7338312558455476224-jvGh" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">linkedin.com/posts/dealingwith</span><span class="invisible">_dataengineering-hiring-startuplife-activity-7338312558455476224-jvGh</span></a></p><p><a href="https://billee.applytojob.com/apply/iTXqZOqOUu/Senior-Data-Engineer" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">billee.applytojob.com/apply/iT</span><span class="invisible">XqZOqOUu/Senior-Data-Engineer</span></a></p><p><a href="https://indieweb.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://indieweb.social/tags/hiring" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>hiring</span></a> <a href="https://indieweb.social/tags/getfedihired" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>getfedihired</span></a> <a href="https://indieweb.social/tags/FediHire" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FediHire</span></a></p>
Mike Spencer<p>A great job with a fantastic group: <a href="https://www.dataorchard.org.uk/analytics-engineer-vacancy" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">dataorchard.org.uk/analytics-e</span><span class="invisible">ngineer-vacancy</span></a></p><p><a href="https://mastodon.scot/tags/DataScience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataScience</span></a> <a href="https://mastodon.scot/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://mastodon.scot/tags/RStats" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RStats</span></a> <a href="https://mastodon.scot/tags/JobFairy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>JobFairy</span></a> <a href="https://mastodon.scot/tags/FediHire" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FediHire</span></a> <span class="h-card" translate="no"><a href="https://data-folks.masto.host/@data_orchard" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>data_orchard</span></a></span></p>
Francois Dion<p>Picked up "Python Polars: The Definitive Guide" by Jeroen Janssens and Thijs Nieuwdorp. The polar bear was already used on another O'Reilly book, but the Iberian lynx is cool.</p><p>Never sure how tech books will pan out, but Jeroen's book "Data Science at the Command Line" was a good one, so I am hopeful.</p><p><a href="https://mastodon.online/tags/python" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>python</span></a> <a href="https://mastodon.online/tags/polars" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>polars</span></a> <a href="https://mastodon.online/tags/dataframes" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataframes</span></a> <a href="https://mastodon.online/tags/datascience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>datascience</span></a> <a href="https://mastodon.online/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://mastodon.online/tags/dataops" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataops</span></a> <a href="https://mastodon.online/tags/book" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>book</span></a> <a href="https://mastodon.online/tags/books" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>books</span></a> <a href="https://mastodon.online/tags/computerscience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>computerscience</span></a> <a href="https://mastodon.online/tags/analytics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>analytics</span></a></p>
gaby_wald<p>🚀 <a href="https://framapiaf.org/tags/OpenToWork" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenToWork</span></a> | Data Engineer | ETL &amp; Quality Control <br>CV (PDF): <a href="http://gabriel.chandesris.free.fr/gabysblog/docs/CVGabrielChandesris.pdf" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">http://</span><span class="ellipsis">gabriel.chandesris.free.fr/gab</span><span class="invisible">ysblog/docs/CVGabrielChandesris.pdf</span></a><br>Expert <a href="https://framapiaf.org/tags/ETL" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ETL</span></a> <a href="https://framapiaf.org/tags/Python" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Python</span></a> <a href="https://framapiaf.org/tags/SQL" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SQL</span></a> <a href="https://framapiaf.org/tags/DataQuality" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataQuality</span></a> <a href="https://framapiaf.org/tags/BigData" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BigData</span></a><br>Ready to optimize your data pipelines!<br>🙏 RT plz <a href="https://framapiaf.org/tags/i4emploi" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>i4emploi</span></a> <a href="https://framapiaf.org/tags/Recrutement" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Recrutement</span></a> <a href="https://framapiaf.org/tags/Emploi" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Emploi</span></a> <a href="https://framapiaf.org/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://framapiaf.org/tags/Spark" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Spark</span></a> <a href="https://framapiaf.org/tags/Scala" 
class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Scala</span></a> ...</p>
Ralph Straumann (@rastrau)<p>In business contexts, <a href="https://swiss.social/tags/Datenqualit%C3%A4t" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Datenqualität</span></a> (data quality) comes up again and again, often together with "authoritativeness", context of origin, <a href="https://swiss.social/tags/Governance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Governance</span></a> models, etc. But in my view it is worth first clarifying the terminology and the meaning of <a href="https://swiss.social/tags/Daten" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Daten</span></a> (data) quality. The start of an attempt: <a href="https://digital.ebp.ch/2025/04/29/datenqualitaet" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">digital.ebp.ch/2025/04/29/date</span><span class="invisible">nqualitaet</span></a> <a href="https://swiss.social/tags/DataManagement" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataManagement</span></a> <a href="https://swiss.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://swiss.social/tags/DataScience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataScience</span></a></p>
thomas yager-madden<p>Doing the laundry is a good analogy for a lot of <a href="https://tilde.zone/tags/dataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataEngineering</span></a> work. It’s literally batch processing innit</p>
Olivier D'Hondt 🛰️🌍🌱<p><a href="https://framapiaf.org/tags/dask" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dask</span></a> is strange. Sometimes using the dask counterpart to numpy functions or arrays makes computations slower. Sometimes not. Also, lots of variability in runtime. <a href="https://framapiaf.org/tags/python" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>python</span></a> <a href="https://framapiaf.org/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a></p>
Will Hopkins 🌈📸Data eng, Spark, GPU
pipTrends<p>If you don’t want to set up a separate database server for vector search, you can use SQLite or DuckDB with a vector extension. In this article, Max Gabrielsson explains how to do it using DuckDB.</p><p><a href="https://duckdb.org/2024/05/03/vector-similarity-search-vss.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">duckdb.org/2024/05/03/vector-s</span><span class="invisible">imilarity-search-vss.html</span></a></p><p><a href="https://mastodon.social/tags/python" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>python</span></a> <a href="https://mastodon.social/tags/Programming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Programming</span></a> <a href="https://mastodon.social/tags/PythonProgramming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PythonProgramming</span></a> <a href="https://mastodon.social/tags/DataScience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataScience</span></a> <a href="https://mastodon.social/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MachineLearning</span></a> <a href="https://mastodon.social/tags/SoftwareDevelopment" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SoftwareDevelopment</span></a> <a href="https://mastodon.social/tags/WebDevelopment" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebDevelopment</span></a> <a href="https://mastodon.social/tags/TechNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TechNews</span></a> <a href="https://mastodon.social/tags/OpenSource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenSource</span></a> <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a></p>
InfoQ<p>📣 Submissions are officially open for <a href="https://techhub.social/tags/InfoQ" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>InfoQ</span></a>'s annual article writing competition!</p><p>🎟️ Top authors win <a href="https://techhub.social/tags/FreeTickets" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FreeTickets</span></a> to industry-leading conferences like QCon &amp; InfoQ Dev Summit – the perfect chance to learn, network, and grow! </p><p>🔗 Get all the details and submit your entry here: <a href="https://bit.ly/43kPTgz" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">bit.ly/43kPTgz</span><span class="invisible"></span></a> </p><p><a href="https://techhub.social/tags/QConLondon" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>QConLondon</span></a> <a href="https://techhub.social/tags/QConSanFrancisco" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>QConSanFrancisco</span></a> <a href="https://techhub.social/tags/InfoQDevSummit" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>InfoQDevSummit</span></a> </p><p><a href="https://techhub.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://techhub.social/tags/ML" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ML</span></a> <a href="https://techhub.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://techhub.social/tags/SoftwareArchitecture" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SoftwareArchitecture</span></a> <a href="https://techhub.social/tags/DevOps" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DevOps</span></a> <a href="https://techhub.social/tags/CloudComputing" class="mention hashtag" 
rel="nofollow noopener" target="_blank">#<span>CloudComputing</span></a></p>
Ben Lorica 罗瑞卡<p>Traditional data processing systems are proving inadequate for handling the complex requirements of AI workloads, particularly in managing heterogeneous computational resources and processing multimodal data<br><a href="https://indieweb.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://indieweb.social/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a> <a href="https://indieweb.social/tags/dataengineer" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineer</span></a><br><a href="https://gradientflow.substack.com/p/paradigm-shifts-in-data-processing" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">gradientflow.substack.com/p/pa</span><span class="invisible">radigm-shifts-in-data-processing</span></a></p>
RJ Nowling<p>As part of a class project, I'm having students read raw events from one table, clean them, and insert them into another table using a batch job that runs periodically. To identify which raw events have been processed but were dropped, I have them use a processed_events table that only has an id column. Unprocessed events can be found with SELECT ... FROM raw_events WHERE raw_events.id NOT IN (SELECT id FROM processed_events). This enables an append-only approach. <a href="https://mastodon.social/tags/SQL" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SQL</span></a> <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a></p>
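The pattern described in the post above could be sketched roughly as follows, using Python's built-in `sqlite3` module. Only `raw_events`, `processed_events`, and the `NOT IN` query come from the post; the `clean_events` table, the `payload` column, the cleaning rule, and the `run_batch` helper are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_events       (id INTEGER PRIMARY KEY, payload TEXT);
CREATE TABLE clean_events     (id INTEGER PRIMARY KEY, payload TEXT NOT NULL);  -- hypothetical target
CREATE TABLE processed_events (id INTEGER PRIMARY KEY);
INSERT INTO raw_events VALUES (1, ' ok '), (2, NULL), (3, 'fine');
""")

def run_batch(conn):
    """Process every raw event whose id is not yet in processed_events."""
    pending = conn.execute(
        "SELECT id, payload FROM raw_events "
        "WHERE id NOT IN (SELECT id FROM processed_events) ORDER BY id"
    ).fetchall()
    for event_id, payload in pending:
        if payload is not None:  # hypothetical cleaning rule: drop NULL payloads
            conn.execute(
                "INSERT INTO clean_events VALUES (?, ?)", (event_id, payload.strip())
            )
        # Record the id whether the event was kept or dropped, so it is not retried.
        conn.execute("INSERT INTO processed_events VALUES (?)", (event_id,))
    conn.commit()
    return len(pending)

first = run_batch(conn)                                    # handles ids 1, 2, 3
conn.execute("INSERT INTO raw_events VALUES (4, 'late')")
second = run_batch(conn)                                   # handles only id 4
print(first, second)
```

Nothing in `raw_events` is ever updated or deleted; each run only appends rows. One caveat worth checking: `NOT IN` matches no rows if the subquery returns a NULL, so the `id` column in `processed_events` should be non-nullable (here it is the primary key), or `NOT EXISTS` can be used instead.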
Sarah Lea<p>The traditional ETL process consists of Extract, Transform, and Load. But tools like Data Cloud from Salesforce now integrate Zero-ETL technology: instead of requiring these 3 traditional steps, data should flow seamlessly between different systems. </p><p>So, what's new? The data from different systems can be used in near real time. There is no need to move data :blobcoffee: <a href="https://towardsdatascience.com/why-etl-zero-understanding-the-shift-in-data-integration-as-a-beginner-d0cefa244154" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">towardsdatascience.com/why-etl</span><span class="invisible">-zero-understanding-the-shift-in-data-integration-as-a-beginner-d0cefa244154</span></a></p><p><a href="https://techhub.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://techhub.social/tags/data" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>data</span></a> <a href="https://techhub.social/tags/datascience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>datascience</span></a> <a href="https://techhub.social/tags/database" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>database</span></a> <a href="https://techhub.social/tags/salesforce" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>salesforce</span></a> <a href="https://techhub.social/tags/DataIntegrationDatabasesEtl" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataIntegrationDatabasesEtl</span></a> <a href="https://techhub.social/tags/etl" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>etl</span></a> <a href="https://techhub.social/tags/python" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>python</span></a></p><p>Comment if you would like the friend link to the Medium 
article, and I will send it to you in a message.</p>
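For contrast with the Zero-ETL idea in the post above, here is a toy sketch of the three traditional steps in plain Python. Everything in it (the CSV text, the `sales` table, the function names, the cleaning rule) is hypothetical and stands in for whatever source and target systems a real pipeline would connect.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from another system.
SOURCE_CSV = "customer,amount\nada,10.5\nbob,not-a-number\ncy,7\n"

def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy names and skip rows whose amount is not numeric."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        cleaned.append((row["customer"].title(), amount))
    return cleaned

def load(conn, rows):
    """Load: copy the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute("SELECT customer, amount FROM sales ORDER BY customer").fetchall()
print(loaded)
```

The data is physically copied from the source into the target here, which is the movement that Zero-ETL approaches aim to avoid.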
Clemens Vasters 🇪🇺<p>Streamifying Reference Data for Temporal Consistency with Telemetry Events <a href="https://mastodon.online/tags/cloudevents" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cloudevents</span></a> <a href="https://mastodon.online/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a></p><p><a href="https://vasters.com/clemens/2024/10/30/streamifying-reference-data-for-temporal-consistency-with-telemetry-events" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">vasters.com/clemens/2024/10/30</span><span class="invisible">/streamifying-reference-data-for-temporal-consistency-with-telemetry-events</span></a></p>