Homework 5: Data Platforms for Data Engineering Zoomcamp 2026

https://github.com/DataTalksC…

Due date: 2026-02-28T23:59:59+00:00 (local time)

Questions

Question 1. Bruin Pipeline Structure. In a Bruin project, what are the required files/directories? (1 point)

Question 2. Materialization Strategies. You're building a pipeline that processes NYC taxi data organized by month based on pickup_datetime. Which incremental strategy is best for processing a specific time interval by deleting and re-inserting the data for that interval? (1 point)

Question 3. Pipeline Variables. You have a variable defined in pipeline.yml:

    variables:
      taxi_types:
        type: array
        items:
          type: string
        default: ["yellow", "green"]

How do you override this when running the pipeline to only process yellow taxis? (1 point)

Question 4. Running with Dependencies. You've modified the ingestion/trips.py asset and want to run it plus all downstream assets. Which command should you use? (1 point)

Question 5. Quality Checks. You want to ensure the pickup_datetime column in your trips table never has NULL values. Which quality check should you add to your asset definition? (1 point)

Question 6. Lineage and Dependencies. After building your pipeline, you want to visualize the dependency graph between assets. Which Bruin command should you use? (1 point)

Question 7. First-Time Run. You're running a Bruin pipeline for the first time on a new DuckDB database. What flag should you use to ensure tables are created from scratch? (1 point)
