Applied GenAI

A program for those who want to build their skills and career in generative AI (GenAI).
During the first semester, we focus on the essential fundamentals, such as basic concepts of math and programming, which are the same for all programs.
In the Math & Statistics course, students learn discrete mathematics and the basics of probability theory. Algorithms and Data Structures covers basic graph algorithms, data structures, and recurrence relations. Also during the first semester, the Data Science course teaches students how to process and analyze data and build simple models from it. In the Advanced Programming Languages course, students learn about programming languages and the fundamental differences between them.
During the second semester, in the Applied GenAI course, students learn how to design and develop AI-powered solutions, apply generative AI tools in practice, and create a final project that integrates the skills and technologies they have mastered. Data Science (part 2) extends the machine learning material into neural networks, transformers, and generative AI. In the Real-Time Backend course, students work through a real-life project example that involves large amounts of data. In the Systems Architecture course, students become familiar with the main concepts of systems architecture.
Semester 1
Math & Statistics
Set Theory
Relations and Functions
Logic and Induction
Sequences
Number Theory
Combinatorics
Recurrence relations
Counting techniques
Graph Theory
Graphs and Numbers
Growth of Functions
The Probability of an Event
Discrete Random Variables
Continuous Probability
Conditional Probability
Distributions and Approximation
Sampling Theory
Statistics
Advanced Algebra
Algorithms and Data Structures
Sorting and Searching
Graphs (basics)
BFS and DFS
Dynamic Programming
LIS, LCS and other 2D problems
Heaps
Dijkstra, Floyd, Bellman-Ford
DFS applications
Knapsack problem
Dynamic programming over subsets
Amortized Time Complexity (Queue via Stack)
Disjoint Set Union
Dynamic programming on a tree
Data Structures. RSQ/RMQ, Sqrt Decomposition, Sparse Table
Minimum Spanning Trees
Matching
Segment Trees
Trie
Binary Search Trees
LCA
What is NP and how is it useful?
Strings
Data Science
AI Introduction
Machine Learning Basics
ML as a task of optimization
Hyperparameters
Probabilistic Approach in ML
Model Validation
Linear regression
Logistic regression
Decision tree
SVM algorithm
Naive Bayes algorithm
KNN algorithm
K-means
Random forest algorithm
Dimensionality reduction algorithms
Gradient boosting and AdaBoost algorithms
Data Preprocessing
Time Series Forecasting
Ranking task
Advanced Programming Languages
The dawn of programming
Grammars
Programming Language Spectrum
Programming paradigms
Programming language syntax
ANTLR
Compilation and interpretation
Binding and memory management
Mutable vs. immutable data structures
Data-Oriented Programming
LLVM
Linking
Introduction to Clojure
Clojure macros
Languages for data
Constraint programming
Lua
Elixir
Learning and coding with LLMs
Semester 2
Applied GenAI
Introduction to Generative AI
LLM Creation Pipeline – Data Preparation & Tokenization
LLM Creation Pipeline – Architectures & Mitigating Hallucinations
Prompt Engineering & Retrieval-Augmented Generation (RAG)
Project 1 Kickoff: Building a Text Generation App
Project 1 Implementation: Enhancing the Text Generation App
Evaluating Text Generation Outputs
Introduction to RLHF & DPO
Fine-Tuning with RLHF/DPO
Project 2 Kickoff: Building an Interactive Conversational Agent
Project 2 Implementation: Enhancing the Interactive Agent
Introduction to Image Generative Models & Data Augmentation
Project 3: Hands-On with Image Generation for Data Augmentation
Ethical Considerations, Safety, & Future Trends in Generative AI
Final Capstone Project Walkthrough & Wrap-Up
Real-Time Backend (Architecture)
CAP. MapReduce
Storage. GFS
RPC. Models. Fault tolerance
Physical & logical time, clocks, ordering of events
Broadcast protocols
Consensus and transactions in distributed systems
Election
Consensus
FLP theorem
Raft algorithm
State machine replication
Distributed transactions
Atomic commit protocols
2-phase commit
Distributed File System (DFS)
Industrial Systems Design & System Design. Principles and main concepts. Technology overview
Design of industrial systems: TinyURL/Pastebin
Netflix/YouTube architecture
Development of a group project together with a team from the Real-Time Frontend specialty
Systems Architecture
C++ - syntax, OOP basics, UB, numbers
Rust - syntax, programming paradigms, safe/unsafe, comparison with C++
Memory Management. Stack and heap memory, variable sizes, Ownership, smart pointers
Memory Management. Core containers, iterators, internal implementations. Error handling
Syscalls, Processes, Scheduling
System limits - rlimit, cgroups, Linux namespaces, seccomp
Multithreading. Mutex, atomic operations. Condition variables, channels
Multithreading. Async functions, coroutines, green threads
Observability. Metrics, Prometheus, Grafana. Logging, telemetry, alerts
Distributed Systems
Data encoding: UTF-8, big-endian vs. little-endian, prefix encoding, possibly Huffman trees
Deserialization: JSON, Protobuf, etc.
Traffic balancing, Nginx
Databases, message queues
Assembler (Instructions, RISC, CISC, x86, Intel, asm basics)
Data Science (part 2)
Big Data, Hadoop, Spark
Artificial Neuron Model
Multilayer ANN 1: Hyperparameters, Regularization, Training Process
Multilayer ANN 2: Adam, Dropout, Weight Initialization, Batch Normalization
Convolutional Neural Networks (CNN)
Recurrent Neural Networks (RNN) – LSTM, GRU
Neural Network Architectures – GAN, Seq2Seq, Autoencoder
Attention and Transformers
Generative AI: Principles and Types of Models
Generative AI: Prompt Engineering, Fine-Tuning, LLM Inference
Approaches to Building RAG Based on LLM
AI Ethics