Mente

Overview of Data Engineering

lifecycle

Data Engineering Teams

The data engineer is a hub between data producers, such as software engineers, data architects, and DevOps or site-reliability engineers (SREs), and data consumers, such as data analysts, data scientists, and ML engineers. In addition, data engineers will interact with those in operational roles, such as DevOps engineers.

ML Engineers vs. Data Engineers

The ML engineer role overlaps with data engineering, but the ML engineer develops more advanced ML techniques, trains models, and designs and maintains the infrastructure that runs ML processes. The role puts more emphasis on MLOps and on other mature practices such as DevOps.

Main components of Data Engineering

Data Generation

Storage

Ingestion

Key Questions:

Transformation

Serving

DataOps

Observability and Monitoring

Data Architecture

Types of data architectures

Modern Data Stack

Data Generation

Data Logs

Messages and Streams

Types of time

Ways of ingesting data

Ingestion undercurrents

Storage

The raw ingredients are disk drives, memory, networking, CPU, serialization, compression, and caching.
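Two of these ingredients, serialization and compression, can be illustrated with a minimal sketch. It assumes JSON as the serialization format and gzip as the compression codec, both from the Python standard library; the `records` list is hypothetical sample data, not something taken from these notes.

```python
import gzip
import json

# Hypothetical records standing in for rows produced by a source system.
records = [{"id": i, "event": "click", "value": i * 1.5} for i in range(1000)]

# Serialization: turn in-memory objects into bytes that can be stored or sent.
serialized = json.dumps(records).encode("utf-8")

# Compression: spend CPU time to get fewer bytes on disk and over the network.
compressed = gzip.compress(serialized)

# Reading reverses both steps: decompress, then deserialize.
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))

print(len(serialized), len(compressed), restored == records)
```

Real storage systems usually pick other formats and codecs; the point here is only that the two steps are separate and each one has a cost in CPU time.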

Raw Ingredients

caching

Data Storage Systems

BASE

Why use BASE (Basically Available, Soft-state, Eventual consistency)? It allows us to use large-scale distributed systems, i.e., to scale horizontally.

When do we use strong consistency? When we can tolerate longer query times but want the correct data every time.
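The trade-off between eventual and strong consistency can be pictured with a toy model. This is a sketch under heavy assumptions: `ReplicatedStore` is a hypothetical class with one primary and one lagging replica, and it does not reflect how any particular database implements replication.

```python
class ReplicatedStore:
    """Toy key-value store with one primary and one lagging replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes not yet copied to the replica

    def write(self, key, value):
        # Writes are acknowledged as soon as the primary has them.
        self.primary[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Stand-in for asynchronous replication catching up.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def read_eventual(self, key):
        # Fast: answer from the replica, which may be stale.
        return self.replica.get(key)

    def read_strong(self, key):
        # Slower: make the replica catch up before answering.
        self.replicate()
        return self.replica.get(key)


store = ReplicatedStore()
store.write("balance", 100)
stale = store.read_eventual("balance")  # replica has not caught up yet
fresh = store.read_strong("balance")    # pays for replication first
```

In this model the eventual read returns quickly but can miss the latest write, while the strong read does extra work so that it always sees it.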

Types of file storage

Data abstractions

Data Ingestion

Data ingestion is the process of moving data from one place to another. A data pipeline is the combination of architecture, systems, and processes that move data through the stages of the DE lifecycle.
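As a rough illustration of data moving through lifecycle stages, the sketch below chains three small functions. The names `ingest`, `transform`, and `serve`, and the `RAW_CSV` sample, are hypothetical; a real pipeline would read from an actual source system and usually run under an orchestrator.

```python
import csv
import io

# Hypothetical extract from a source system.
RAW_CSV = "user_id,amount\n1,10.5\n2,3.0\n1,4.5\n"


def ingest(raw_text):
    """Ingestion: read rows from the source as dictionaries."""
    yield from csv.DictReader(io.StringIO(raw_text))


def transform(rows):
    """Transformation: cast types and total the amount per user."""
    totals = {}
    for row in rows:
        user = int(row["user_id"])
        totals[user] = totals.get(user, 0.0) + float(row["amount"])
    return totals


def serve(totals):
    """Serving: shape the result for a downstream consumer."""
    return [{"user_id": u, "total": t} for u, t in sorted(totals.items())]


result = serve(transform(ingest(RAW_CSV)))
print(result)
```

Keeping each stage as its own function makes it possible to test, monitor, or replace one stage without touching the others.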

Considerations:

Ways to ingest data

Undercurrent of data ingestion

Data Transformation

The lifetime of a SQL query is:

A data model represents the way data relates to the real world. It describes how data must be organized to reflect the organization's processes, definitions, workflows, and logic.

Types of normalization (Codd normal forms)
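One way to picture a normalization step is the sketch below, which removes a transitive dependency (customer_name depends on customer_id, which in turn depends on order_id), the kind of step associated with third normal form. The table contents and column names are hypothetical sample data.

```python
# Hypothetical denormalized rows: customer details repeat on every order.
orders_flat = [
    {"order_id": 1, "customer_id": 10, "customer_name": "Ada", "amount": 25.0},
    {"order_id": 2, "customer_id": 10, "customer_name": "Ada", "amount": 40.0},
    {"order_id": 3, "customer_id": 11, "customer_name": "Lin", "amount": 15.0},
]

# Customer attributes depend only on customer_id, so they get their own table.
customers = {
    row["customer_id"]: {"customer_name": row["customer_name"]}
    for row in orders_flat
}

# Orders keep only a reference (foreign key) to the customer.
orders = [
    {"order_id": r["order_id"], "customer_id": r["customer_id"], "amount": r["amount"]}
    for r in orders_flat
]

# Joining the two tables reconstructs the original rows, so nothing was lost.
rejoined = [{**o, **customers[o["customer_id"]]} for o in orders]
print(rejoined == orders_flat)
```

After the split, a customer's name is stored once, so renaming a customer means updating a single row instead of every order.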

Data modeling

Update patterns

Serving data

What a data engineer should know about ML:

New trends in data engineering