
posh the engine.
@poshlovesdata
Data Engineer & AI Systems Builder ⚙️ I build data pipelines, AI agents & automation tools that actually ship.



Hi everyone, blessed day 🙏 I started late yesterday, so I ended up working through the night trying to migrate my Python script to the cloud (as I mentioned in my last post). Ran into some real challenges.

When I deployed the script and tried to run it, I kept getting a “localhost” error. Then it clicked: my PostgreSQL database was running locally, so once the script was in the cloud, it couldn’t reach my machine. So I made a shift and decided to move my database online. I explored cloud RDBMS options and chose Supabase (a very powerful platform built on PostgreSQL).

But… another challenge. When I tried connecting my Python script to Supabase and ingesting data from the API, I ran into a host connection error, something I hadn’t faced locally. Spent time debugging but couldn’t resolve it before calling it a night. Not going to lie, it was frustrating.

But it’s a new day. Today’s goal: fix the connection issue, get the pipeline running in the cloud, and complete this workflow. Not giving up now 😎 @Smanmalik83 I think I’m starting to understand the challenges you mentioned.
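For anyone curious, here’s a minimal sketch of what a direct Python → Supabase Postgres connection check can look like (psycopg2 assumed; the environment variable names are placeholders, not my actual setup):

```python
# Minimal connection check against a hosted Postgres instance (e.g. Supabase).
# Env var names below are placeholders for illustration only.
import os
import psycopg2

conn = psycopg2.connect(
    host=os.environ["SUPABASE_DB_HOST"],        # host string from the project dashboard
    port=int(os.environ.get("SUPABASE_DB_PORT", 5432)),
    dbname=os.environ.get("SUPABASE_DB_NAME", "postgres"),
    user=os.environ["SUPABASE_DB_USER"],
    password=os.environ["SUPABASE_DB_PASSWORD"],
    sslmode="require",                          # enforce TLS; a common difference from a local Postgres setup
)
with conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone())
conn.close()
```

Getting the exact host/port from the dashboard and keeping credentials in environment variables (not hardcoded) is usually the first thing worth double-checking when a remote connection fails but local works.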

I’ve mostly worked with static datasets in Excel and Power BI. Now I’m taking the next step: working with real-time data.

My next focus: API → Python → SQL → Power BI

Instead of downloading datasets, I want to:
- Pull data directly from APIs
- Clean and transform it using Python
- Store and query it with SQL
- Build dashboards in Power BI

The goal is to move closer to how data is actually handled in real-world scenarios. Still learning, but excited to build this end-to-end workflow. If you’ve worked with APIs before, I’d appreciate any tips or resources. @chidirolex @_VictorUgwu @Smanmalik83 @iam_daniiell @ObohX #DataAnalytics #PowerBI #Python #SQL
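A rough sketch of the API → Python → SQL part of that workflow (the endpoint, table name, and connection string here are placeholders):

```python
# Pull JSON from an API, do light cleaning with pandas, load it into Postgres.
import requests
import pandas as pd
from sqlalchemy import create_engine

API_URL = "https://api.example.com/v1/records"   # placeholder endpoint

resp = requests.get(API_URL, timeout=30)
resp.raise_for_status()                          # fail loudly on HTTP errors

df = pd.DataFrame(resp.json())
df = df.drop_duplicates()                        # basic cleaning; adapt to the real schema

# Placeholder connection string; point this at whatever database backs the dashboards.
engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/analytics")
df.to_sql("api_records", engine, if_exists="append", index=False)
```

Power BI can then connect straight to that table, so refreshing the dashboard just means rerunning the script.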








i need you guys to promise me something. this thing’s hard to build. like really hard. and yes it’s gonna be completely free, idc. i just need y’all to promise me you’re gonna use it and put your friends on it. not download and keep, actually use and post about it. thanks



Phase 2:👀 To prepare the data in the SQLite DB to be sent in a compressed format (Parquet), what did I do?

1. Created a Python script that:
> Checks the table for unsynced records (records where sync_status is 'pending')
> Converts just those records into Parquet using pandas
> Then updates those records' sync_status to 'synced' in the DB (matched by record_id)

Why did I do this? So when the script runs again, it only looks at records that haven't been synced yet. Now I have lightweight data that can be sent over a minimal internet connection.

What's next?
> I'll be creating an uploader script that polls for an internet connection every 10 seconds. Once a connection is confirmed, it uploads the Parquet file into an already provisioned S3 bucket.

#DataEngineering #ETL #Python #DataAnalytics
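Roughly, the export script looks like this (a simplified sketch; the DB filename and output folder are assumptions, and `to_parquet` needs pyarrow or fastparquet installed):

```python
# Incremental export: only rows still marked 'pending' get written to Parquet.
import os
import sqlite3
from datetime import datetime, timezone

import pandas as pd

DB_PATH = "vitals.db"   # assumed filename
OUT_DIR = "outbox"      # assumed staging folder for files awaiting upload
os.makedirs(OUT_DIR, exist_ok=True)

conn = sqlite3.connect(DB_PATH)
pending = pd.read_sql_query(
    "SELECT * FROM patient_vitals WHERE sync_status = 'pending'", conn
)

if not pending.empty:
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    pending.to_parquet(os.path.join(OUT_DIR, f"vitals_{stamp}.parquet"), index=False)

    # Mark only the exported rows as synced so the next run skips them.
    conn.executemany(
        "UPDATE patient_vitals SET sync_status = 'synced' WHERE record_id = ?",
        [(rid,) for rid in pending["record_id"]],
    )
    conn.commit()

conn.close()
```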







I'll definitely use an offline-first approach, and here's how I'd do it:
1. Store the data locally in a lightweight DB like SQLite.
2. Run a cron job to regularly batch and compress that data into Parquet.
3. Run another script that polls for an internet connection; once a connection comes up, it uploads the Parquet into remote storage like an S3 bucket (sketch below).
4. From there, S3 event notifications can trigger the ingestion pipeline to deduplicate and model the data.
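For step 3, a rough sketch of the polling uploader (the bucket name, outbox folder, and the TCP-based connectivity check are placeholder choices, not a final design):

```python
# Poll for connectivity; when it's back, push any staged Parquet files to S3.
import glob
import os
import socket
import time

import boto3

BUCKET = "placeholder-raw-bucket"   # placeholder bucket name
OUTBOX = "outbox"                   # folder where the export script drops Parquet files
POLL_SECONDS = 10

def internet_up(host="8.8.8.8", port=53, timeout=3):
    """Cheap connectivity check: try to open a TCP socket to a public DNS server."""
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        return True
    except OSError:
        return False

s3 = boto3.client("s3")

while True:
    if internet_up():
        for path in glob.glob(os.path.join(OUTBOX, "*.parquet")):
            s3.upload_file(path, BUCKET, f"raw/{os.path.basename(path)}")
            os.remove(path)   # drop the local copy once it's safely in S3
    time.sleep(POLL_SECONDS)
```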




Proof of Concept:😶🌫️ So I started working on this as a project. What I've done:
1. Initialized a SQLite database and created a table called patient_vitals to store patients' vitals.
2. Built a Python script to generate and load data into the SQLite database.
3. Automated the data generation and loading with a cron job, which currently runs every minute (I'll eventually change it to 10 mins) to simulate actual data entry in a health facility.

What's next? Prepare the data into highly compressed payloads (Parquet) so it's ready the millisecond internet is available in the health facility.
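A simplified sketch of the generator/loader script (the schema below is an assumption for illustration, not the actual table definition), scheduled with a crontab line like `* * * * * python3 /path/to/generate_vitals.py` to run every minute:

```python
# Generate one fake vitals reading and insert it into SQLite, marked 'pending' for later sync.
import random
import sqlite3
import uuid
from datetime import datetime, timezone

DB_PATH = "vitals.db"   # assumed filename

conn = sqlite3.connect(DB_PATH)
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS patient_vitals (
        record_id   TEXT PRIMARY KEY,
        patient_id  TEXT,
        heart_rate  INTEGER,
        temperature REAL,
        recorded_at TEXT,
        sync_status TEXT DEFAULT 'pending'
    )
    """
)

row = (
    str(uuid.uuid4()),                          # record_id
    f"PAT-{random.randint(1, 500):04d}",        # fake patient id
    random.randint(55, 120),                    # heart rate, bpm
    round(random.uniform(35.5, 39.5), 1),       # temperature, °C
    datetime.now(timezone.utc).isoformat(),     # recorded_at
    "pending",                                  # sync_status
)
conn.execute("INSERT INTO patient_vitals VALUES (?, ?, ?, ?, ?, ?)", row)
conn.commit()
conn.close()
```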














