Azure
Azure is a cloud computing platform that offers PaaS, IaaS and SaaS, supporting many languages and frameworks.
On-Premise: everything (hardware, platform and software) is hosted and managed in-house
PaaS (Cloud Native): Platform as a Service
IaaS: Infrastructure as a Service
SaaS: Software as a Service
Services / Resources:
Compute: Web Apps, VMs, Mobile Services, Cloud Services, Batch Service
Storage: Blob Containers, Table Storage (NoSQL key-value), Queues, File Shares, Data Lake
Network: Virtual Network, Load Balancer, Application Gateway, VPN Gateway
Processing Big Data: HDInsight, Data Lake Analytics, Databricks, Data Factory
Messaging: Service Bus, Event Hubs, Event Grid
Caching: Azure Cache for Redis, CDN
Identity and Access: Azure Active Directory, Key Vault
Mobile: Mobile Apps, Notification Hubs
Backup: Azure Backup, Site Recovery
Create free account and install azure-cli:
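For example, one way to install the CLI (macOS via Homebrew, or Debian/Ubuntu via Microsoft's install script):

```
# macOS (Homebrew)
brew install azure-cli

# Debian / Ubuntu (Microsoft's install script)
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

# verify the installation
az --version
```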
Login:
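A minimal login, optionally selecting the subscription to work in (the subscription name is a placeholder):

```
# opens a browser to authenticate and lists the subscriptions you can access
az login

# optionally select the subscription to work in
az account set --subscription "My Subscription"
```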
A clear hierarchical structure simplifies management of resources. A tenant represents an organization: a dedicated instance of Azure Active Directory that registers and manages apps, and the users and systems that have access to them. A company may have multiple subscriptions to separate concerns. Each subscription (e.g. a department) can contain multiple resource groups (systems), which hold multiple resources (services), which in turn can have subresources.
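The same hierarchy can be explored from the CLI (the resource group name below is just an example):

```
# subscriptions the signed-in account can see (within the tenant)
az account list --output table

# resource groups in the active subscription
az group list --output table

# resources inside a specific resource group
az resource list --resource-group MyResourceGroup --output table
```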
The infrastructure of an application is typically made up of many different components, which should ideally be deployed, managed and monitored as a group. Azure Resource Manager (ARM) provides a secure way to manage resources using a template, the portal, or the Azure CLI.
It is important to know that the CLI tools make REST requests to the ARM API in the background. The portal can also generate (export) the templates it uses to deploy resources.
Tip: An important concept in every deployment request is idempotence.
Templates describe resource deployments in a declarative way, and template deployments are idempotent.
The portal makes HTTPS requests to the ARM API in the background, which are idempotent.
The CLI tool also makes HTTPS requests to the ARM API in the background, so it is idempotent as well (a small demonstration follows below).
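For example, creating the same resource group twice ends in the same state (name and location are placeholders):

```
az group create --name DemoGroup --location westeurope

# running the exact same request again neither fails nor creates a duplicate
az group create --name DemoGroup --location westeurope
```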
Resources can also be created with language-specific SDKs.
Specifying the --mode flag (demonstrated in the deploy command further below):
Incremental: Resources not specified will be left untouched
Complete: Resources not specified will be destroyed
Deploy the template:
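A minimal template could look like the sketch below; the storage account name, location and API version are placeholders, not values taken from this walkthrough:

```
# write a minimal ARM template declaring a single storage account
cat > template.json <<'EOF'
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "apiVersion": "2019-06-01",
      "name": "mystorageaccount123",
      "location": "westeurope",
      "sku": { "name": "Standard_LRS" },
      "kind": "StorageV2"
    }
  ]
}
EOF
```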
Deploy using command:
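For example (the resource group name is a placeholder; newer CLI versions use az deployment group create instead):

```
# deploy the template into a resource group; --mode defaults to Incremental,
# --mode Complete would remove resources that are not in the template
az group deployment create \
  --resource-group MyResourceGroup \
  --template-file template.json \
  --mode Incremental
```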
Login and create resource group:
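For example (group name and location are placeholders):

```
az login

# resource group that will hold the storage account and the data factory
az group create --name DataFactoryRG --location westeurope
```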
Create Storage Account:
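For example (the account name is a placeholder; it must be globally unique, 3-24 lowercase letters and digits):

```
az storage account create \
  --name mydatafactorystore \
  --resource-group DataFactoryRG \
  --location westeurope \
  --sku Standard_LRS
```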
A data factory is an automated pipeline where activities such as ingestion, processing and publishing can be scheduled.
Tip: follow along with a tutorial to set up a pipeline.
First, a storage account within a resource group is required as a place to ingest the data (see the commands above):
Create a new blob container
Upload a text file named "emp.txt" to the folder "input" (the folder can be set under the Advanced tab), containing the lines below (a CLI equivalent of these two steps is sketched after them):
John, Doe
Jane, Doe
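Both steps can also be done from the CLI instead of the portal; the container name is a placeholder, and the storage account name matches the placeholder used earlier:

```
# create the container (add --account-key or --auth-mode login if prompted)
az storage container create --name mycontainer --account-name mydatafactorystore

# create emp.txt locally and upload it into the "input" folder
printf 'John, Doe\nJane, Doe\n' > emp.txt
az storage blob upload \
  --account-name mydatafactorystore \
  --container-name mycontainer \
  --name input/emp.txt \
  --file emp.txt
```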
We have a blob container containing a txt file. Now we need a data factory:
Open a new browser tab with the Azure portal
Create a new resource, under Analytics select Data Factory
Enter a name and click create
When it's created, click on Author and Monitor
Create a linked service:
In the newly opened tab, click Author
In the left bottom under Connections, create a new Linked Service
Select Azure Blob Storage, enter the subscription and storage account details, press Test Connection, then Finish
Create datasets:
Create a new Dataset
Choose Azure Blob Storage
Enter "InputDataset" as name
Under Connection select the linked service and click browse to select the emp.txt file
Repeat the steps to create the output dataset
Create a pipeline:
Create a pipeline
Give it a name, then pick Copy Data under Move & Transform
Select the source and sink (InputDataset and OutputDataset)
Debugging the pipeline:
Click debug in the pipeline tab
In the first browser tab (which you hopefully kept open), verify that the output file has been created
Publishing, triggering and monitoring the pipeline:
Instead of debug, click trigger
Monitor the pipeline by clicking the monitor tab in the left sidebar
This has been a simple copy operation, but it can easily be expanded into a more complex process. The tutorial goes on to describe scheduling as well.
There are many tools and resources available with respect to the solutions that Azure offers, for example: