
Real-Time Analytics

General information

Course code: 222891-D
Erasmus / ISCED code: (no data) / (no data)
Course name: Real-Time Analytics
Unit: Szkoła Główna Handlowa w Warszawie (SGH Warsaw School of Economics)
Groups: Elective courses for QEM - masters
Major courses for AAB - masters
Elective major courses for SMMD-EKO
Compulsory courses in the SMMD-ADA programme
ECTS credits and other: 3.00 (may vary over time)
Basic information on the rules for assigning ECTS credits:
  • the annual student workload required to achieve the intended learning outcomes for a given stage of studies is 1500-1800 h, which corresponds to 60 ECTS;
  • the weekly student workload is 45 h;
  • 1 ECTS credit corresponds to 25-30 hours of student work needed to achieve the intended learning outcomes;
  • the weekly student workload required to achieve the intended learning outcomes earns 1.5 ECTS;
  • the workload needed to pass a course assigned 3 ECTS constitutes 10% of the student's semester workload.
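
The rules above reduce to simple arithmetic. The following is a minimal sketch, assuming a semester carries 30 ECTS (half of the 60 ECTS annual figure); the constant and function names are illustrative and do not come from the catalog.

```python
# Workload arithmetic implied by the ECTS rules listed above.
HOURS_PER_ECTS = (25, 30)   # 1 ECTS corresponds to 25-30 hours of student work
ECTS_PER_YEAR = 60          # annual workload of 1500-1800 h
ECTS_PER_SEMESTER = ECTS_PER_YEAR // 2  # assumption: two equal semesters


def workload_hours(ects):
    """Return the (min, max) hours of student work for a number of ECTS credits."""
    low, high = HOURS_PER_ECTS
    return ects * low, ects * high


def semester_share(ects):
    """Return the fraction of a semester's workload that `ects` credits represent."""
    return ects / ECTS_PER_SEMESTER


course_hours = workload_hours(3)   # this course is assigned 3 ECTS
course_share = semester_share(3)
print(course_hours, course_share)
```

Under these assumptions, the 3 ECTS assigned to this course correspond to 75-90 hours of work, consistent with the 10% share of the semester workload stated above.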

Language of instruction: English
Learning outcomes:

Knowledge:

Know the history and philosophy of data processing models

Know the types of structured and unstructured data

Know the possibilities and areas of real-time data processing

Know the theoretical aspects of REST API and pub/sub

Be able to choose the IT structure for a given business problem

Understand the business needs of making decisions in a very short time

Skills:

Distinguish between structured and unstructured data types

Be able to prepare, process and save data generated in real time

Understand the limitations arising from the processing time of devices and IT systems

Apply and construct a system for real-time processing

Be able to prepare reporting for a real-time processing system

Social competences:

Formulate an analytical problem along with its IT solution

Strengthen the ability to independently build on theoretical and practical knowledge of programming, modelling, and new information technologies that use real-time analysis.

Classes in the "Summer semester 2025/26" cycle (not yet started)

Period: 2026-02-21 - 2026-09-30
Class types:
Laboratory, 20 hours
Lecture, 10 hours
Coordinators: (no data)
Group instructors: Sebastian Zając
Assessment: Course - Grade
Lecture - Grade
Short description:

1. From Flat Files to Data Mesh: Data Processing Models in Big Data.

2. ETL and Batch (Offline Learning) and Incremental (Online Learning) Modeling. Map-Reduce.

3. Data Streams, Events, and Time and Time Window Concepts in Real-time Data Processing.

4. Microservices and Communication via REST API.

5. Contemporary Architectures for Stream Data Processing Applications - Lambda, Kappa, Pub/Sub.

6. Processing Structured and Unstructured Data. Programming Environment for Python.

7. Utilizing Python Object-Oriented Elements in the Modeling Process with Scikit-Learn and Keras.

8. Python Object-Oriented Programming Basics. Building Classes for Random Walk, Perceptron, and Adaline Algorithms.

9. Preparing a Microservice with an ML Model for Production Use.

10. Streaming Data Using RDDs with Apache Spark. Introduction to the DataFrame Object.

11. Methods for Creating Data Streams Using the DataFrame Object in Apache Spark. Setting Output and Input.

12. S
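
Topic 3 in the list above covers time windows. As a minimal illustration in plain Python (not taken from the course materials; the function name `tumbling_window_counts` and the sample events are hypothetical), a tumbling window assigns each event to one fixed-size, non-overlapping interval based on its event time:

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per fixed, non-overlapping (tumbling) window.

    `events` is an iterable of (timestamp_in_seconds, payload) pairs.
    Returns a dict mapping each window's start time to its event count.
    """
    counts = Counter()
    for timestamp, _payload in events:
        # Align the timestamp down to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical click events: (event time in seconds, user id)
events = [(1, "a"), (4, "b"), (9, "a"), (10, "c"), (12, "b"), (25, "a")]
window_counts = tumbling_window_counts(events, window_seconds=10)
print(window_counts)
```

Stream processing engines provide windowing as a built-in operation; the sketch only shows the assignment of events to windows, without late-data handling or watermarks.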

Full description:

Making informed decisions based on data and its analysis is fundamental in modern business. Techniques such as machine learning, artificial intelligence, and deep neural networks can significantly improve business understanding and the quality of decisions. The speed of decision-making is also crucial in a dynamic business environment, especially when dealing directly with customers. The goal of these classes is to give students practical experience and comprehensive theoretical knowledge of real-time data processing and analysis, and to introduce the latest information technology for processing structured data (e.g., from data warehouses) and unstructured data (e.g., images, sound, video streaming) online. The classes present the philosophy of real-time analysis of large data sets using Python. Software structures for data processing will be introduced, along with a discussion of the issues and challenges encountered when modeling large amounts of data in real time. Theoretical knowledge will be consolidated through hands-on exercises using tools such as Apache Spark and Apache Kafka. In the lab sessions, students will work in fully configured development environments prepared for data processing, modeling, and analysis, so that, in addition to analytical skills and techniques, they become familiar with the latest information technology for real-time data processing.

Literature:

Basic literature:

1. Zając S., "Modelowanie dla biznesu. Analityka w czasie rzeczywistym - narzędzia informatyczne i biznesowe", Oficyna Wydawnicza SGH, Warszawa, 2022.

2. Przanowski K., Zając S. (eds.), "Modelowanie dla biznesu, metody ML, modele portfela CF, modele rekurencyjne, analizy przeżycia, modele scoringowe", SGH, Warszawa, 2020.

3. Frątczak E. (ed.), "Modelowanie dla biznesu, Regresja logistyczna, Regresja Poissona, Survival Data Mining, CRM, Credit Scoring", SGH, Warszawa, 2019.

4. Raschka S., "Python. Uczenie maszynowe", 2nd edition.

5. Maas G., Garillot F., "Stream Processing with Apache Spark", O'Reilly, 2021.

6. Hueske F., Kalavri V., "Stream Processing with Apache Flink", O'Reilly, 2021.

7. Nandi A., "Spark for Python Developers", 2015.

Supplementary literature:

1. Frątczak E., "Statistics for Management & Economics", SGH, Warszawa, 2015.

2. Simon P., "Too Big to IGNORE. The Business Case for Big Data", John Wiley & Sons Inc., 2013.

3. Ohlhorst F. J., "Big Data Analytics. Turning Big Data into Big Money", John Wiley & Sons Inc., 2013.

4. Russell J., "Zwinna analiza danych Apache Hadoop dla każdego", Helion, 2014.

5. Todman C., "Projektowanie hurtowni danych, Wspomaganie zarządzania relacjami z klientami", Helion, 2011.

6. Bruce P., Bruce A., Gedeck P., "Statystyka praktyczna w data science. 50 kluczowych zagadnień w językach R i Python", Helion, 2nd edition, 2021.

Notes:

Evaluation criteria

Traditional Written Exam: 0.00%

Multiple Choice Test (MS Teams + Forms): 40.00%

Oral Exam: 0.00%

Test (completed during labs): 20.00%

Papers/Essays (Preparing a presentation): 40.00%

Other: 0.00%
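
The non-zero weights above combine into an overall result as a weighted sum. The following is a minimal sketch, assuming each component is scored on a 0-100 scale; the dictionary keys and the sample scores are hypothetical, and the catalog does not specify how the resulting percentage maps to a final grade.

```python
# Weights taken from the evaluation criteria above (they sum to 100%).
WEIGHTS = {
    "multiple_choice_test": 0.40,
    "lab_test": 0.20,
    "presentation": 0.40,
}


def final_score(scores):
    """Weighted sum of component scores, each on an assumed 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student results, for illustration only.
example = {"multiple_choice_test": 70, "lab_test": 90, "presentation": 80}
score = final_score(example)
print(round(score, 1))
```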

The threshold percentage of absences (excluding lectures), defined as the proportion of class hours beyond which the achievement of learning outcomes is deemed unattainable: 50%

Classes in the "Winter semester 2025/26" cycle (in progress)

Period: 2025-10-01 - 2026-02-20
Class types:
Laboratory, 20 hours
Lecture, 10 hours
Coordinators: (no data)
Group instructors: (no data)
Assessment: Course - Grade
Lecture - Grade
Short description, full description, literature, and notes: same as for the "Summer semester 2025/26" cycle above.

Classes in the "Summer semester 2024/25" cycle (ended)

Period: 2025-02-15 - 2025-09-30
Class types:
Laboratory, 20 hours
Lecture, 10 hours
Coordinators: (no data)
Group instructors: Szymon Chudziak, Sebastian Zając
Assessment: Course - Grade
Lecture - Grade
Short description, full description, and literature: same as for the "Summer semester 2025/26" cycle above.

Classes in the "Winter semester 2024/25" cycle (ended)

Period: 2024-10-01 - 2025-02-14
Class types:
Laboratory, 20 hours
Lecture, 10 hours
Coordinators: (no data)
Group instructors: (no data)
Assessment: Course - Grade
Lecture - Grade
Short description, full description, and literature: same as for the "Summer semester 2025/26" cycle above.

Course descriptions in USOS and USOSweb are protected by copyright.
The copyright holder is Szkoła Główna Handlowa w Warszawie (SGH Warsaw School of Economics).
al. Niepodległości 162
02-554 Warszawa
tel: +48 22 564 60 00 http://www.sgh.waw.pl/