datahub-dine

Category: Educational system applications
Development tool: Python
File size: 0KB
Downloads: 0
Upload date: 2021-12-16 00:10:17
Uploader: sh-1993
Description: Data Hub INteractive Education (DINE) is demo content that shows how to use the features of SAP Data Hub.

File list:
.reuse/ (0, 2021-12-15)
.reuse/dep5 (1884, 2021-12-15)
LICENSE (11358, 2021-12-15)
LICENSES/ (0, 2021-12-15)
LICENSES/Apache-2.0.txt (11358, 2021-12-15)
data/ (0, 2021-12-15)
data/Picture1.png (750790, 2021-12-15)
data/addresses.csv (6067, 2021-12-15)
data/customer.csv (4966, 2021-12-15)
data/images/ (0, 2021-12-15)
data/images/er_diagram.jpg (231825, 2021-12-15)
data/product.csv (22493, 2021-12-15)
data/return.csv (5380, 2021-12-15)
data/reviews.csv (11186, 2021-12-15)
data/soHeader.csv (1256742, 2021-12-15)
data/soItem.csv (2628235, 2021-12-15)
tutorials/ (0, 2021-12-15)
tutorials/customer return prediction/ (0, 2021-12-15)
tutorials/customer return prediction/code snippets/ (0, 2021-12-15)
tutorials/customer return prediction/code snippets/JoiningDataset.py (1784, 2021-12-15)
tutorials/customer return prediction/code snippets/PredictReturn.py (2419, 2021-12-15)
tutorials/customer return prediction/dockerfile (77, 2021-12-15)
tutorials/customer return prediction/images/ (0, 2021-12-15)
tutorials/customer return prediction/images/10.JPG (26738, 2021-12-15)
tutorials/customer return prediction/images/11.JPG (29659, 2021-12-15)
tutorials/customer return prediction/images/12.JPG (58633, 2021-12-15)
tutorials/customer return prediction/images/13.JPG (53799, 2021-12-15)
tutorials/customer return prediction/images/14.JPG (27712, 2021-12-15)
tutorials/customer return prediction/images/15.JPG (34507, 2021-12-15)
tutorials/customer return prediction/images/16.JPG (23982, 2021-12-15)
tutorials/customer return prediction/images/17.JPG (15899, 2021-12-15)
tutorials/customer return prediction/images/19.JPG (22191, 2021-12-15)
tutorials/customer return prediction/images/2.JPG (36228, 2021-12-15)
tutorials/customer return prediction/images/20.JPG (35500, 2021-12-15)
tutorials/customer return prediction/images/21.JPG (32542, 2021-12-15)
tutorials/customer return prediction/images/22.JPG (30971, 2021-12-15)
tutorials/customer return prediction/images/23.JPG (76713, 2021-12-15)
tutorials/customer return prediction/images/24.JPG (45886, 2021-12-15)
... ...

# DataHub Interactive Education (DINE)

[![REUSE status](https://api.reuse.software/badge/github.com/SAP-samples/datahub-dine)](https://api.reuse.software/info/github.com/SAP-samples/datahub-dine)

## Overview

Data Hub INteractive Education (DINE) is educational content for [SAP Data Hub](https://www.sap.com/products/data-hub.html). Its hands-on exercises show you how to use SAP Data Hub features.

SAP Data Hub lets you connect to different data sources such as SAP HANA, SAP ERP, SAP BW, Oracle, DB2, SQL Server, and many more. It can process structured, semi-structured, and unstructured data using Kafka, a streaming engine, text and image analysis, and similar tools. SAP Data Hub brings all your data together so you can work across it seamlessly.

You can quickly develop a prototype on SAP Data Hub and turn the result into a production-level system, since SAP Data Hub takes care of execution, orchestration, scheduling, and monitoring.

SAP Data Hub is built on Kubernetes and is therefore deployable on premise or in the cloud. It runs on a distributed execution engine and is designed for the Big Data world, providing an understanding of the metadata in a Big Data landscape.

Also see the [official documentation](https://help.sap.com/viewer/p/SAP_DATA_HUB) of [SAP Data Hub](https://www.sap.com/products/data-hub.html).

DINE makes it easy to learn how to build pipelines in SAP Data Hub using its operators. It acts as a reference for application developers and showcases the features of Data Hub in an easy-to-understand business scenario.

This demo content comes complete with:

- Sample data
- Code snippets
- Tutorials

## Prerequisites

SAP Data Hub Setup

- Follow the [Installation Guide for SAP Data Hub](https://help.sap.com/viewer/e66c399612e84a83a8abe97c0eeb443a/2.4.latest/en-US/9f866d8ef9a94c30947f12e73eaf0dd9.html) and set up your SAP Data Hub environment.
You can also use [SAP Data Hub Developer Edition](https://blogs.sap.com/2017/12/06/sap-data-hub-developer-edition/) or [SAP Data Hub Trial Edition](https://blogs.sap.com/2018/04/26/sap-data-hub-trial-edition/).

## Scenarios

![Alt text](./data/Picture1.png "Optional title")

We will learn SAP Data Hub through the scenarios below, which are based on a fictitious entity called SAP Data Hub Market Place, an e-commerce platform developed for demo and learning purposes, where customers across the globe make thousands of purchases every day.

The scenarios are detailed below:

- [Customer Return Prediction](./tutorials/customer%20return%20prediction/README.md): This scenario identifies the products that customers are likely to return frequently, based on different parameters. It is implemented in Python and uses the sklearn library's decision tree classifier. It reads data from different data sources and uses SAP Analytics Cloud to visualize the result dataset. Follow the [tutorial](./tutorials/customer%20return%20prediction) to implement this scenario.

More scenarios can be found in the [teched-2018](https://github.com/SAP/datahub-dine/tree/teched-2018) branch.

## Datasets

The dataset for the above scenarios comprises 7 files, which contain customer, product, and sales information.

- The CUSTOMER table holds customer details. Its ADDRESSID column maps to the ADDRESS table, where customer addresses are stored.
- When a customer buys a product, a sales order is generated (SO_HEADER), and each sales order has multiple order items (SO_ITEM).
- SO_HEADER has PARTNERID, a foreign key that links to the CUSTOMER table.
- SO_ITEM has SALESORDERID, a foreign key that links to SO_HEADER.
- Each SO_ITEM has a PRODUCTID that maps to the PRODUCT table, where product details are stored.
- Customer reviews of the products are stored in the REVIEW table.
- Information about returns made by customers is stored in the RETURN table.
- In total, there are 7 tables.

> This is a synthetic dataset derived from [SHINE](https://github.com/SAP/hana-shine-xsa) and enriched to suit our use cases.

### ER Diagram

![Alt text](./data/images/er_diagram.jpg "Optional title")

To access the datasets, explore the [data](./data) folder in this repository.

## Known issues

None

## Support

Please use GitHub issues to report any bugs.

## License

Copyright (c) 2017-2020 SAP SE or an SAP affiliate company. All rights reserved. This project is licensed under the Apache Software License, version 2.0 except as noted otherwise in the [LICENSE](LICENSES/Apache-2.0.txt) file.
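
As an illustration of the foreign-key relationships listed in the Datasets section, the following is a minimal sketch that follows SO_ITEM to SO_HEADER to CUSTOMER, and SO_ITEM to PRODUCT, using only the Python standard library. It is not the repository's `JoiningDataset.py`. The rows are made-up placeholders rather than values from the CSV files, and the assumption that CUSTOMER is keyed by a column named PARTNERID, as well as the `NAME` column and the helper names `index_by` and `join_order_items`, are hypothetical.

```python
# Hypothetical in-memory rows; the real data lives in the CSV files under ./data.
# Only ADDRESSID, PARTNERID, SALESORDERID and PRODUCTID are named in the README;
# the key column of CUSTOMER and the NAME column are assumptions.
customers = [{"PARTNERID": "C1", "ADDRESSID": "A1"}]
so_headers = [{"SALESORDERID": "S1", "PARTNERID": "C1"}]
so_items = [
    {"SALESORDERID": "S1", "PRODUCTID": "P1"},
    {"SALESORDERID": "S1", "PRODUCTID": "P2"},
]
products = [
    {"PRODUCTID": "P1", "NAME": "Notebook"},
    {"PRODUCTID": "P2", "NAME": "Mouse"},
]


def index_by(rows, key):
    """Build a lookup from a key column to its row (assumes the key is unique)."""
    return {row[key]: row for row in rows}


def join_order_items(so_items, so_headers, customers, products):
    """Resolve each order item to its sales order, customer, and product."""
    headers_by_id = index_by(so_headers, "SALESORDERID")
    customers_by_id = index_by(customers, "PARTNERID")
    products_by_id = index_by(products, "PRODUCTID")

    joined = []
    for item in so_items:
        header = headers_by_id[item["SALESORDERID"]]      # SO_ITEM -> SO_HEADER
        customer = customers_by_id[header["PARTNERID"]]   # SO_HEADER -> CUSTOMER
        product = products_by_id[item["PRODUCTID"]]       # SO_ITEM -> PRODUCT
        joined.append({
            **item,
            "PARTNERID": header["PARTNERID"],
            "ADDRESSID": customer["ADDRESSID"],
            "NAME": product["NAME"],
        })
    return joined


rows = join_order_items(so_items, so_headers, customers, products)
for row in rows:
    print(row)
```

Dictionary lookups keep the sketch dependency-free. For the full CSV files, the same joins could be expressed with a dataframe library, provided the actual column names are checked against the file headers first.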
