Azure ML
Azure Machine Learning (Azure ML) empowers data scientists and developers to build, deploy, and manage high-quality models faster and with confidence.
TEAM
1 Product Owner
1 Product Designer III
1 UX Researcher
2 Engineering Leads
DELIVERABLES
Wireframes
Prototypes
Usability Testing
HIFI Designs
DURATION
12 months
RELEASES
Microsoft Ignite 2019 (Private Preview)
Microsoft Build 2020 (General Release)
OVERVIEW
Shortly after launching Azure ML 1.0 in 2018, the Azure ML team identified a need for data scientists and developers to upload their datasets directly to the platform through the UI. Connecting datasets from the Azure Portal was a cumbersome experience, and Azure ML wanted to give its users an alternative to SDK integration.
My role was to design a process by which users could intuitively upload and manage their datasets directly through the Azure ML UI.
HYPOTHESIS
If data scientists and developers are able to upload and manage their datasets directly through the Azure ML UI, we can reduce the friction of existing data ingestion flows, leading to increased adoption of Azure’s ML services and increased usage and retention across the entire platform.
CHALLENGES
Azure ML was a new product and still finding its footing outside of the Azure Portal
Design language was still in transition between the Microsoft Azure and Fluent design systems
Foundational research was still a work in progress
Ongoing internal restructuring meant some product direction was fluid
Design team was still growing while supporting a much larger engineering organization
UNDERSTAND
My first step after joining the team was to familiarize myself with the data science and data modeling space. Working with my product and engineering stakeholders, I was able to better define the following:
Who are our users and what are their needs?
What are the feature requirements and business goals?
What does the current landscape for dataset ingestion look like?
What are the key steps for data ingestion?
KICKOFF WORKSHOPS
PERSONAS
Leveraging existing user research, I worked with the product owner and research team to identify our two key user groups:
Data Scientist | Target User
Analyzes data and builds models to solve business problems
Business background
Less tech-savvy, prefers UI-based solutions
Wants to get up and running as soon as possible
Data Engineer | Existing User
Maintains and operates the data
Integrates and monitors data pipelines
Technical background
Familiar with Python and IDEs
REQUIREMENTS
Phase 1
Microsoft Ignite 2019 | Private Preview
Dataset list view
table of created datasets
action to create datasets
Dataset creation
define basic info
dataset name
dataset type
description
datastore selection
view file settings and preview
data file settings
data preview
schema mapping
set data type
include/exclude dataset features
Dataset details view
Meta information
Explore dataset
Dataset models
Phase 2
Microsoft Build 2020 | General Release
Advanced dataset creation
inline datastore creation
select existing datastore
create new datastore
description
datastore selection
set traits on dataset features
select trait type (timestamp, label, image, etc)
apply trait to feature values
DESIGN GOALS
After determining the business and design requirements, I outlined the design goals:
Design a list view for datasets, so that the user can easily view their created datasets and create new ones
Design a create experience for datasets, so that the user can intuitively ingest their datasets through the UI
Design a create experience that allows for basic data preparation, so that the user can get up and running with their datasets sooner
Design a details view for datasets, so that users can see a summary of their dataset details and explore their data
Align designs with the Azure and Microsoft Fluent design systems to maintain a consistent look and feel
DISCOVERY
What are other tools that allow the user to create or upload datasets?
KEY TAKEAWAYS
Creating or uploading datasets follows common patterns like side panels or modals
Some tools let users quickly create open datasets from public sources
Data preparation during creation is uncommon, which is an opportunity for Azure ML
P1 | WIREFRAMES
Keeping my design goals in mind, my next step was wireframing dataset creation flows.
WHITEBOARD
BALSAMIQ
FIGMA
DATASET INGESTION FLOW
OPTION 1 | Side Panel Creation
Pros
+ Compact design
+ Consistent with existing create patterns in Azure ML
Cons
- Many form fields could lead to a long vertical scroll
- User may easily lose track of which step they are on
OPTION 2 | Step Creation
Pros
+ More screen space for complex creation flows
+ Breaks flow into related steps to help guide the user
Cons
- New component to design and build
- User may get overwhelmed by number of steps
CREATE DATASET | From Web Files
P1 | PROTOTYPE & TESTING
Focusing on the step creation option, I created interactive prototypes to conduct usability testing with the research team.
We ran 1-hour moderated interviews on UserTesting.com with five participants. All participants had data science or machine learning backgrounds and were familiar with the Azure ecosystem.
TESTING GOALS
How do users respond overall to dataset creation and management?
Are the steps and form fields clear? Do they understand the concept and benefit of setting traits?
Is the terminology clear or too specific to Microsoft’s concepts of machine learning?
QUESTIONS
How do you organize and manage data?
How do you ensure data quality?
How do you feel about performing data operations in a UI tool?
How does the Azure ML datasets feature compare with your current practices around data?
Is there anything missing in the datasets feature? Not needed? Anything confusing?
KEY INSIGHTS
Out of five users, one was able to locate the dataset creation start point easily, two found it after some exploration, and two had to be prompted to its location. During exploration, two users surmised it might be in datastores, and two guessed in an authoring flow.
Add a “no data/FRE” state for first-time users to make the creation action more prominent and explore adding more messaging on the datastore view to clarify the relationship between datastores and datasets.
Most users understood the purpose of each wizard step, but there was some confusion around terminology, required fields, and the downstream impacts of certain input selections.
Add more descriptive messaging for each step, info icons for unclear terminology and links to documentation.
In the schema step, users found setting traits column by column cumbersome. There could be dozens of feature columns to which they want to assign a single trait.
Explore options for making trait selection its own step so users can bulk assign traits to feature columns (moved to P2).
After dataset creation, four out of five users were uncertain about what steps to take next and how to leverage their newly created dataset with machine learning.
Explore nudges that guide the user onward in the ML flow with their newly created dataset.
P1 | HIFI DESIGNS
Taking into consideration feedback from usability testing, I improved the wireframes and created high-fidelity designs using components from the Fluent design library.
Wireframe | Dataset Creation
Wireframe | Dataset Details
HIFI | Dataset Creation
HIFI | Dataset Details
P1 | FINAL PROTOTYPES & HANDOFF
High-fidelity prototypes were created for final reviews with product owner(s) and handoff to engineering. An early version of dataset creation in Azure ML launched as a private preview for Microsoft Ignite 2019.
P2 | WIREFRAMES
After delivering dataset creation for Microsoft Ignite, our goals for Phase 2 were to let users create a datastore and set traits during dataset creation.
OPTION 1 | Datastore Panel Creation
Pros
+ Quicker to implement
+ Reusable component
Cons
- Panel over a panel is not ideal, and could lead to confusion
OPTION 2 | Inline Datastore Creation
Pros
+ Creation is more contextual
+ Simplifies creation experience into a single view
Cons
- More complex to build
- Increases vertical scroll
OPTION 1 | Set Traits in Panel
OPTION 2 | Inline Set Traits
P2 | PROTOTYPE & TESTING
Focusing on panel vs inline, I created interactive prototypes to conduct usability testing with the research team to better understand which option users preferred for creating datastores and setting traits.
We ran 1-hour moderated interviews on UserTesting.com with four participants. All participants had data science or machine learning backgrounds and were familiar with the Azure ecosystem.
TESTING GOALS
Is the flow for inline datastore creation clear?
Do users understand the concept of “set traits”? Does our terminology match existing concepts?
Are users confused by the flow for setting traits during dataset creation?
QUESTIONS
How do you feel about creating datastores during dataset creation? Anything missing?
What are the main issues you experience with data? Workarounds? What would you change about your data practices?
How do you feel about setting traits when creating a dataset?
KEY INSIGHTS
Overall, users felt positive about creating a datastore during dataset creation. It saved them the hassle of leaving the wizard to create one if they had forgotten to do so beforehand. However, they mentioned that the dropdown for selecting existing datastores did not provide enough contextual information.
Explore other components to use instead of a dropdown to display existing datastores (list view/table).
Most users expressed a general understanding of the interaction design for setting traits (items defined in the top portion would be added to the table below), but they were confused about what exactly was meant by “traits” and about the definitions of the “timestamp fine” and “timestamp coarse” traits.
Discuss with the product team whether we are overloading the user with extra steps during dataset creation. Confusion around traits was a consistent thread in the feedback; explore more options for advanced schema mapping after dataset creation.
P2 | HIFI DESIGNS
Taking into consideration feedback from usability testing, I improved the wireframes and created high-fidelity designs using components from the Fluent design library.
Wireframe | Inline Datastore Creation
Wireframe | Inline Set Traits
HIFI | Inline Datastore Creation
HIFI | Inline Set Traits
P2 | FINAL PROTOTYPES & HANDOFF
High-fidelity prototypes were created for final reviews with product owner(s) and handoff to engineering. Inline datastore creation and set traits in Azure ML launched as a general release for Microsoft Build 2020.
KEY LEARNINGS
There are existing patterns and mental models to create and upload datasets
Data science is still an evolving field; user personas and needs are constantly changing
Focus on simpler ingestion flows; data prep and setting traits can complicate things for the user
Overall feedback was positive; users liked having an alternative way to ingest datasets directly through Azure ML
OUTCOMES
Private preview for dataset ingestion launched for Microsoft Ignite 2019
General release for dataset ingestion launched for Microsoft Build 2020
METRICS | Week of 06/01/20
2.3x more datasets were ingested directly through Azure ML
NEXT STEPS
Continue to explore flows that follow dataset creation, such as EDAT (explore data and transform), which lets a data scientist experiment on a subset of their data. This allows them to transform a dataset at a lower cost and time commitment, then apply any changes to the larger dataset.