This talk presents the UNCG ITS Infrastructure Analytics Team's approach to "cloud-first" data pipelines, the extraction of actionable insight from that data, and the presentation of the results in dashboards for relevant stakeholders. Pipelines are built on an on-premises Apache Airflow instance that pushes data primarily into Google BigQuery, and also into Azure Blob Storage. From there, a variety of machine-learning classification algorithms have been written for anomaly detection and outlier elimination, as well as metric and status forecasting for VMs, apps, and services. These run either as Python scripts using the TensorFlow and scikit-learn libraries in Google Cloud Functions, or as PySpark scripts in Azure ML within Synapse. Finally, the results are assembled into Power BI dashboards for stakeholders at the University. Each path we take is weighed in terms of cost and speed, so along with the how, we’ll talk about why we do what we do in our multi-cloud approach.