Redha Cherif
@redha_rc

Senior analytics engineer | Data trainer | Speaker | Data blogger

A free data newsletter 👉
Joined October 2024
14 Following · 0 Followers
49 posts
Redha Cherif @redha_rc
Growing fast as an analytics engineer has almost nothing to do with the degrees you earned. It's about always learning, being ambitious and going after it with all your energy. If you can work hard enough over a long period of time, you can become the best and outperform anyone with the best degrees out there. #career #dataengineering #analyticsengineering
Redha Cherif @redha_rc
What I observed when I started working as a data analyst

When I first started working as a data analyst, I came from a research background in computational chemistry. I thought I'd be the only one coming from a non-computer-science or big-data background. Actually, I was totally wrong. Most of my colleagues came from various backgrounds and had entered the data & AI industry after doing a bootcamp in data analysis.

Takeaway: the background you come from doesn't matter for landing a data position. The main thing is to answer this question: "How could my current skills translate into the data & AI industry?"

Nothing fancy. All meaningful.

#career #data #ai
Redha Cherif @redha_rc
For analytics engineers who want to understand the key benefits the acquisition of SDF Labs brings for them:

- The ability to compile SQL queries locally, bypassing the data warehouse compiler. Query execution speed could increase by up to 7x, leading to significant time and cost savings.
- The ability to obtain column-level lineage, which will be powerful for data governance and access management.
- The state method will be automatically included when models are run, meaning only modified models will be executed with the dbt run command! No more need for the more complicated command: dbt run --select state:modified+ --state path/to/manifest.json
- A built-in linter, faster than sqlfluff, will be implemented automatically. The commands dbt lint and dbt lint --fix should be used to lint and fix your code.

#dbt #SDF #analyticsengineering #dataengineering
Redha Cherif @redha_rc
Data pipelines before and after the introduction of dbt on the market in 2016

Before 2016:
• monolithic data pipelines
• very long SQL scripts
-> very expensive, as pipelines were built from scratch every time
-> data quality issues induced by data discrepancies for the same KPI in different dashboards (loss of confidence from business stakeholders)
-> high complexity in understanding bugs when production breaks

After 2016:
• modularity, with staging, intermediate and mart models
• software engineering practices for data pipelines
-> easier to maintain
-> reusability of models
-> version control for team collaboration

#dbt #analyticsengineering #dataengineering
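A minimal sketch of the "after 2016" style, with a hypothetical source and model name: instead of one giant script, each transformation lives in its own small staging model that downstream models can build on.

```sql
-- models/staging/stg_orders.sql (hypothetical model name)
-- One small, reusable module instead of a multi-thousand-line script.
select
    order_id,
    customer_id,
    cast(order_date as date) as order_date,
    amount
from {{ source('shop', 'raw_orders') }}
```

Because the model is registered in dbt's graph, anyone in the organisation can reference it with ref('stg_orders') instead of starting over from the raw source.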
Redha Cherif @redha_rc
Many analytics engineers keep building overly complex data pipelines! 🚨

Why? Because they forget the key concept of dbt: modularity.

Modularity is the concept of building your final tables as independent modules. This brings many benefits, such as:
• Reusability -> no need to start over from the source every time
• Collaboration and team efficiency -> teams can work on different parts of the pipeline simultaneously
• Easier debugging and maintenance -> simpler to understand and fix issues in data pipelines

Always be curious and stick to the basics. The basics will make you the best.

Hope this finds you well.

#dbt #analyticsengineering #dataengineering
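The reusability benefit can be sketched with a hypothetical mart model: it consumes an existing staging module via dbt's ref() rather than re-reading raw sources, so the staging logic is written once and shared.

```sql
-- models/marts/fct_daily_revenue.sql (hypothetical model name)
-- Reuses the staging module instead of duplicating source-cleaning logic.
select
    order_date,
    sum(amount) as daily_revenue
from {{ ref('stg_orders') }}
group by order_date
```

If the staging logic ever changes, every mart built on ref('stg_orders') picks up the fix automatically.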
Redha Cherif @redha_rc
Best practices:
- For staging models, use the prefix stg_
- For intermediate models, use the prefix int_
- Materialise staging and intermediate models as ephemeral models.
- For mart models, use the prefix dim_ for dimensions, fact_ for fact tables and obt_ for One Big Tables.
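The materialisation advice above can be applied once per folder in dbt_project.yml rather than model by model; a sketch, assuming a hypothetical project name and folder layout:

```yaml
# dbt_project.yml (fragment; "my_project" and folder names are hypothetical)
models:
  my_project:
    staging:
      +materialized: ephemeral   # stg_ models: inlined as CTEs, no warehouse objects
    intermediate:
      +materialized: ephemeral   # int_ models: same treatment
    marts:
      +materialized: table       # dim_ / fact_ / obt_ models: persisted for consumers
```

Folder-level configs keep individual model files free of boilerplate, and any model can still override the default in its own config block.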
Redha Cherif @redha_rc
This is your final layer: the tables that will constitute your team's data mart, and most probably the source for your activation layer, which will be the direct backend of your dashboards.
Redha Cherif @redha_rc
Keeping in mind the key concept of dbt, modularity, will be a game-changer when building your data pipelines!

Before dbt was released, monolithic data pipelines were the norm in the industry: very long SQL scripts (10,000+ lines) were used to build tables, data marts and datasets. If someone else in the organisation wanted to model similar tables, they would start over, since modifying long SQL scripts was harder than starting again from the source data.

This was problematic at various levels:
- it was expensive
- data discrepancies appeared for the same KPI in two different places
- high complexity when production breaks

#dbt #careers #dataanalytics #analyticsengineering #dataengineering
Redha Cherif @redha_rc
2 Udemy courses everyone willing to transition into data science should know about:

- The Complete Python Bootcamp From Zero to Hero in Python: buff.ly/4h86zMw
- Python for Data Science and Machine Learning Bootcamp: buff.ly/3WsSHnL

You can get them cheap, and the content is very good! I started with those 6 years ago. I hope you enjoy them.
Redha Cherif @redha_rc
dbt commands which do not generate the manifest.json file, a fundamental piece for slim CI:

- dbt deps -> installs dbt packages
- dbt clean -> removes your target and dbt_packages folders and any other unnecessary files or directories generated during the execution of dbt commands
- dbt debug -> checks whether your dbt project is set up correctly (your profiles.yml, your dbt_project.yml, whether git is installed, and your connection to your data warehouse)

Takeaway: the manifest.json is generated by any command which parses your dbt project (build, run, test, seed, snapshot, compile, ...)
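To see why the manifest matters for slim CI, here is a sketch of the idea behind state:modified, using a heavily simplified stand-in for manifest.json (the real file holds far more metadata; the node ids and checksums below are made up): dbt compares node checksums between an old and a new manifest to decide which models changed.

```python
# Simplified stand-ins for two manifest.json files (hypothetical node ids).
# Real manifests contain many more keys; slim CI only needs checksums here.
old_manifest = {"nodes": {
    "model.shop.stg_orders": {"checksum": "aaa"},
    "model.shop.fct_revenue": {"checksum": "bbb"},
}}
new_manifest = {"nodes": {
    "model.shop.stg_orders": {"checksum": "aaa"},   # unchanged
    "model.shop.fct_revenue": {"checksum": "ccc"},  # modified
}}

def modified_models(old, new):
    """Return node ids that are new or whose checksum changed (the spirit of state:modified)."""
    old_nodes = old["nodes"]
    return sorted(
        node_id
        for node_id, node in new["nodes"].items()
        if old_nodes.get(node_id, {}).get("checksum") != node["checksum"]
    )

print(modified_models(old_manifest, new_manifest))  # → ['model.shop.fct_revenue']
```

This is why commands that skip parsing (deps, clean, debug) can never feed slim CI: without a fresh manifest, there is nothing to diff against.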
Redha Cherif @redha_rc
Growing fast in data analytics has almost nothing to do with data degrees

It's about finding the way your skills can translate into the industry and being curious enough to always keep learning, as it is a very fast-evolving area.

Then add:
- very hard work
- soft-skills development
- confidence

Correct game plan + obsession = results
Redha Cherif @redha_rc
What I observed when I started to use slim CI

When I started to use slim CI, I struggled to understand dbt artifacts. I thought I'd be the only one having difficulty understanding this notion. Actually, lots of my colleagues didn't bother understanding artifacts. Always be curious and willing to understand the most difficult notions.

Takeaway: dbt artifacts are JSON files produced when dbt commands are run. These files are stored in the /target folder. They are mainly used for monitoring purposes, as they are filled with metadata such as the dbt version you're using, the status of your job, the environment you executed your job in, ...

Hope this finds you well.
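A sketch of the monitoring use case, using a heavily simplified stand-in for the run_results.json artifact (the real file carries much richer metadata: timings, thread ids, messages; the node ids below are made up): an alerting job only needs to scan the per-node statuses.

```python
# Simplified stand-in for /target/run_results.json (hypothetical node ids).
run_results = {
    "metadata": {"dbt_version": "1.8.0"},  # example version, for illustration
    "results": [
        {"unique_id": "model.shop.stg_orders", "status": "success"},
        {"unique_id": "model.shop.fct_revenue", "status": "error"},
    ],
}

def failed_nodes(artifact):
    """Return ids of nodes whose run did not succeed, for monitoring/alerting."""
    return [r["unique_id"] for r in artifact["results"] if r["status"] != "success"]

print(failed_nodes(run_results))  # → ['model.shop.fct_revenue']
```

In practice you would json.load the artifact from the /target folder after a job and push the failed ids to your alerting channel.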
Redha Cherif @redha_rc
Struggling to understand what modularity in dbt is?

Yesterday, my newsletter subscribers got my article explaining this concept in a simple manner.

Missed the issue? Grab it below ⬇️
buff.ly/3WBGLzX
Redha Cherif @redha_rc
Understanding the power of modularity with dbt!

Exciting news! My next newsletter article dives into modularity, the core concept dbt introduced for building data pipelines 🎉

Here is what you have to know:
• Modularity consists of building a final product from independent modules
• Modularity translates into dbt through the use of staging, intermediate and mart models

Tomorrow, at 7:30 AM ET, I'll explain this concept to my newsletter subscribers, and why it is so important to understand it properly.

The link here: buff.ly/3WBGLzX
Redha Cherif @redha_rc
What I observed when I decided to quit my PhD

When I started wondering whether I should quit my PhD, I asked other PhD students for advice. I thought I'd be the only fool having these weird thoughts. Actually, it was the total opposite: 95% of the researchers I talked to wondered whether they should quit every single morning.

Takeaway: when struggling to make a decision, talking to others can simply make you feel "normal". This will help your decision-making.

Nothing fancy. All meaningful.