Azure ML

Azure Machine Learning (Azure ML) empowers data scientists and developers to build, deploy, and manage high-quality models faster and with confidence.

TEAM

1 Product Owner
1 Product Designer III
1 UX Researcher
2 Engineering Leads

DELIVERABLES

Wireframes
Prototypes
Usability Testing
HIFI Designs

DURATION

12 months

RELEASES

Microsoft Ignite 2019 (Private Preview)
Microsoft Build 2020 (General Release)

OVERVIEW

Shortly after launching Azure ML 1.0 in 2018, the Azure ML team identified a need for data scientists and developers to upload their datasets directly to the platform through the UI. Connecting datasets from the Azure Portal was cumbersome, and Azure ML wanted to give its users an alternative to SDK integration.

My role was to design a process by which users could intuitively upload and manage their datasets directly through the Azure ML UI.

HYPOTHESIS

If data scientists and developers can upload and manage their datasets directly through the Azure ML UI, we can reduce the friction of existing data ingestion flows, leading to increased adoption of Azure’s ML services and increased usage and retention across the entire platform.

CHALLENGES

  1. Azure ML was a new product and still finding its footing outside of the Azure Portal

  2. Design language was still in transition between the Microsoft Azure and Fluent design systems

  3. Foundational research was still a work in progress

  4. Ongoing internal restructuring meant some product direction was fluid

  5. Design team still growing, supporting a much larger engineering organization

UNDERSTAND

My first step after joining the team was to familiarize myself with the data science and data modeling space. Working with my product and engineering stakeholders, I was able to better define the following:

  • Who are our users and what are their needs?

  • What are the feature requirements and business goals?

  • What does the current landscape for dataset ingestion look like?

  • What are the key steps for data ingestion?

KICKOFF WORKSHOPS

PERSONAS

Leveraging existing user research, I worked with the product owner and research team to identify our two key user groups:

Data Scientist | Target User

  • Analyzes data and builds models to solve business problems

  • Business background

  • Less tech-savvy, prefers UI-based solutions

  • Wants to get up and running as soon as possible

Data Engineer | Existing User

  • Maintains and operates the data

  • Integrates and monitors data pipelines

  • Technical background

  • Familiar with Python and IDEs

REQUIREMENTS

Phase 1

Microsoft Ignite 2019 | Private Preview


  • Dataset list view

    • table of created datasets

    • action to create datasets

  • Dataset creation

    • define basic info

      • dataset name

      • dataset type

      • description

      • datastore selection

    • view file settings and preview

      • data file settings

      • data preview

    • schema mapping

      • set data type

      • include/exclude dataset features

  • Dataset details view

    • Meta information

    • Explore dataset

    • Dataset models
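As an illustration only, the Phase 1 creation inputs listed above could be modeled as a small data structure. This is a sketch under stated assumptions, not the shipped Azure ML implementation: the class names, field names, and the example dataset type and data type values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ColumnMapping:
    """One row of the schema mapping step (hypothetical structure)."""
    name: str
    data_type: str = "string"  # assumed example values: "string", "integer", "date"
    included: bool = True      # include/exclude toggle for this dataset feature


@dataclass
class DatasetDraft:
    """State collected by the creation flow before the dataset is saved."""
    name: str
    dataset_type: str          # assumed example values: "tabular", "file"
    datastore: str
    description: str = ""
    file_settings: Dict[str, str] = field(default_factory=dict)
    schema: List[ColumnMapping] = field(default_factory=list)

    def included_columns(self) -> List[str]:
        """Names of the features the user chose to keep."""
        return [col.name for col in self.schema if col.included]

    def missing_required_fields(self) -> List[str]:
        """Basic-info fields that are still empty (assumed to be required)."""
        required = {
            "name": self.name,
            "dataset_type": self.dataset_type,
            "datastore": self.datastore,
        }
        return [key for key, value in required.items() if not value.strip()]


# Example: a draft with one excluded column and no datastore selected yet.
draft = DatasetDraft(
    name="sales",
    dataset_type="tabular",
    datastore="",
    schema=[
        ColumnMapping("order_id", "integer"),
        ColumnMapping("notes", included=False),
    ],
)
```

Keeping the draft separate from the saved dataset mirrors the stepwise flow: each step fills in part of the structure, and validation can run before the final step.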

Phase 2

Microsoft Build 2020 | General Release


  • Advanced dataset creation

    • inline datastore creation

      • select existing datastore

      • create new datastore

      • description

      • datastore selection

    • set traits on dataset features

      • select trait type (timestamp, label, image, etc.)

      • apply trait to feature values

DESIGN GOALS

After determining the business and design requirements, I outlined the design goals:

  • Design a list view for datasets, so that the user can easily view their created datasets and create new ones

  • Design a create experience for datasets, so that the user can intuitively ingest their datasets through the UI

  • Design a create experience that allows for basic data preparation, so that the user can get up and running with their datasets sooner

  • Design a details view for datasets, so that users can see a summary of their dataset details and explore their data

  • Align designs with the Azure and Microsoft Fluent design systems to maintain a consistent look and feel

DISCOVERY

What are other tools that allow the user to create or upload datasets?

KEY TAKEAWAYS

  • Creating or uploading datasets follows common patterns like side panels or modals

  • Being able to quickly create open datasets from public sources

  • Data preparation during creation is uncommon, which is an opportunity for Azure ML

P1 | WIREFRAMES

Keeping my design goals in mind, my next step was wireframing dataset creation flows.

WHITEBOARD

BALSAMIQ

FIGMA

DATASET INGESTION FLOW

OPTION 1 | Side Panel Creation

Pros
+ Compact design
+ Consistent with existing create patterns in Azure ML

Cons
- Many form fields could lead to a long vertical scroll
- User may easily lose track of which step they are on

OPTION 2 | Step Creation

Pros
+ More screen space for complex creation flows
+ Breaks flow into related steps to help guide the user

Cons
- New component to design and build
- User may get overwhelmed by number of steps

CREATE DATASET | From Web Files

P1 | PROTOTYPE & TESTING

Focusing on the step creation option, I created interactive prototypes to conduct usability testing with the research team.

Testing consisted of 1-hour moderated interviews on UserTesting.com with five participants. All participants had data science or machine learning backgrounds and were familiar with the Azure ecosystem.

TESTING GOALS

  • How do users respond overall to dataset creation and management?

  • Are the steps and form fields clear? Do they understand the concept and benefit of setting traits?

  • Is the terminology clear or too specific to Microsoft’s concepts of machine learning?

QUESTIONS

  • How do you organize and manage data?

  • How do you ensure data quality?

  • How do you feel about performing data operations in a UI tool?

  • How does the Azure ML datasets feature compare with your current practices around data?

  • Is there anything missing in the datasets feature? Not needed? Anything confusing?

KEY INSIGHTS

Out of five users, one was able to locate the dataset creation start point easily, two found it after some exploration, and two had to be prompted to its location. During exploration, two users surmised it might be in datastores, and two guessed in an authoring flow.
Add a “no data/FRE” (first-run experience) state for first-time users to make the creation action more prominent, and explore adding more messaging on the datastore view to clarify the relationship between datastores and datasets.

Most users generally understood the purpose of each wizard step, but there was some confusion around terminology, required fields and the downstream impacts for certain input selections.
Add more descriptive messaging for each step, info icons for unclear terminology and links to documentation.

In the schema step, users found setting traits by feature column cumbersome. There could be dozens of columns to which they want to assign a single trait.
Explore options for making trait selection its own step so users can bulk assign traits to feature columns (moved to P2).
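The bulk-assignment idea above can be sketched in a few lines. This is a minimal illustration, not the Azure ML implementation: the function name, the mapping structure, and the allowed trait set (taken from the trait types named in the requirements) are assumptions.

```python
from typing import Dict, Iterable, Optional

# Trait types named in the requirements; the real set may differ.
ALLOWED_TRAITS = {"timestamp", "label", "image"}


def bulk_assign_trait(
    traits: Dict[str, Optional[str]],
    columns: Iterable[str],
    trait: str,
) -> Dict[str, Optional[str]]:
    """Return a copy of the column-to-trait mapping with one trait applied to many columns."""
    if trait not in ALLOWED_TRAITS:
        raise ValueError(f"Unknown trait: {trait}")
    selected = list(columns)
    unknown = [col for col in selected if col not in traits]
    if unknown:
        raise KeyError(f"Columns not in schema: {unknown}")
    updated = dict(traits)  # leave the caller's mapping untouched
    for col in selected:
        updated[col] = trait
    return updated


# Example: tag two columns as labels in a single action.
schema_traits = {"created_at": None, "churned": None, "upgraded": None}
result = bulk_assign_trait(schema_traits, ["churned", "upgraded"], "label")
```

Returning a new mapping instead of mutating the input makes it easy to preview the change in the UI before the user confirms it.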

After dataset creation, four out of five users were uncertain about what steps to take next and how to leverage their newly created dataset with machine learning.
Explore nudges that keep the user moving through the ML flow with their newly created dataset.

P1 | HIFI DESIGNS

Based on feedback from usability testing, the wireframes were improved and high-fidelity designs were created using components from the Fluent design library.

Wireframe | Dataset Creation

Wireframe | Dataset Details

HIFI | Dataset Creation

HIFI | Dataset Details

P1 | FINAL PROTOTYPES & HANDOFF

High-fidelity prototypes were created for final reviews with product owner(s) and handoff to engineering. An early version of dataset creation in Azure ML launched as a private preview for Microsoft Ignite 2019.

P2 | WIREFRAMES

After delivering dataset creation for Microsoft Ignite, our goals for Phase 2 were to let users create a datastore and set traits during dataset creation.

OPTION 1 | Datastore Panel Creation

Pros
+ Quicker to implement
+ Reusable component

Cons
- Panel over a panel is not ideal and could lead to confusion

OPTION 2 | Inline Datastore Creation

Pros
+ Creation is more contextual
+ Simplifies creation experience into a single view

Cons
- More complex to build
- Increases vertical scroll

OPTION 1 | Set Traits in Panel

OPTION 2 | Inline Set Traits

P2 | PROTOTYPE & TESTING

Focusing on panel vs. inline, I created interactive prototypes to conduct usability testing with the research team and better understand which option users preferred for creating datastores and setting traits.

Testing consisted of 1-hour moderated interviews on UserTesting.com with four participants. All participants had data science or machine learning backgrounds and were familiar with the Azure ecosystem.

TESTING GOALS

  • Is the flow for inline datastore creation clear?

  • Do users understand the concept of “set traits”? Does our terminology match existing concepts?

  • Are users confused by the flow for setting traits during dataset creation?

QUESTIONS

  • How do you feel about creating datastores during dataset creation? Anything missing?

  • What are the main issues you experience with data? Workarounds? What would you change about your data practices?

  • How do you feel about setting traits when creating a dataset?

KEY INSIGHTS

Overall, users felt positive about creating a datastore during dataset creation. It saved them the hassle of leaving the wizard to create one if they had not done so already. However, they mentioned that the dropdown for selecting existing datastores did not have enough contextual information.
Explore other components to use instead of a dropdown to display existing datastores (list view/table).

Most users expressed a general understanding of the interaction design for setting traits (items defined in the top portion would be added to the table below), but they were confused about what exactly was meant by “traits” and by the definitions of the “timestamp fine” and “timestamp coarse” traits.
Discuss further with the product team whether we are overloading the user with extra steps during dataset creation. Confusion around traits was a consistent thread in the feedback; explore more options for advanced schema mapping after dataset creation.

P2 | HIFI DESIGNS

Based on feedback from usability testing, the wireframes were improved and high-fidelity designs were created using components from the Fluent design library.

Wireframe | Inline Datastore Creation

Wireframe | Inline Set Traits

HIFI | Inline Datastore Creation

HIFI | Inline Set Traits

P2 | FINAL PROTOTYPES & HANDOFF

High-fidelity prototypes were created for final reviews with product owner(s) and handoff to engineering. Inline datastore creation and set traits in Azure ML launched as a general release for Microsoft Build 2020.

KEY LEARNINGS

  • There are existing patterns and mental models to create and upload datasets

  • Data science is still an evolving field; user personas and needs are constantly changing

  • Focus on simpler ingestion flows; data prep and setting traits can complicate things for the user

  • Overall feedback was positive; users liked having an alternative way to ingest datasets directly through Azure ML

OUTCOMES

  • Private preview for dataset ingestion launched for Microsoft Ignite 2019

  • General release for dataset ingestion launched for Microsoft Build 2020

METRICS | Week of 06/01/20

2.3x more datasets were ingested directly through Azure ML

NEXT STEPS

Continue to explore post-dataset creation flows like EDAT (explore data and transform), allowing a data scientist to experiment on a subset of their data. This allows a data scientist to transform a dataset at a lower cost and time commitment while applying any changes to the larger dataset.
