FABRIC Community Visioning Workshop Report

April 15-16, 2020

Executive Summary

Motivation

Participation

Overarching Themes

Participant Feedback on FABRIC Architecture

Breakout Sessions: Major Highlights

IoT/Edge

AI/ML

Security

Network

Big Data

Lightning Talks

Major Takeaways

Future Workshops

Upcoming Roundtables

Appendices

Agenda

Articles, Papers and Resources Suggested by Workshop Participants

Questions and Answers

Submitted White Papers

FABRIC is supported in part by a Mid-Scale RI-1 NSF award under Grant No. 1935966

Executive Summary

The FABRIC Community Visioning Workshop brought together over 200 active participants to

listen to updates on the FABRIC architecture and schedule as well as provide input on the initial

design and use cases. A combination of talks and breakout sessions ensured a rich dialog

between the community and the FABRIC team and, despite the virtual format, participants were

actively engaged. The workshop produced many insightful takeaways that the FABRIC team is

using to finalize the technical architecture, refine our community engagement strategy to include

shorter-format sessions, and reach out to additional users. Given the virtual format and robust

online interaction, a substantial number of questions and answers are included as an appendix.

Motivation

Building a new testbed without the broader CISE and domain science community input on use

cases and technical features prior to production would be counterproductive, and we don’t

believe in a “build it and they will come” approach. We purposefully began our series of FABRIC

events with a Community Visioning Workshop to gather early input and feedback from

experimenters on specific use cases and to obtain partial validation of the proposed initial

design prior to final decisions. Though we began our workshop series with those who are by

and large active testbed users and advocates, our vision for future workshops is to include

experimenters who have not previously relied upon testbeds, but who could benefit from them.

Our goals are to be inclusive of the CISE and domain science communities, and remain

open-minded as to potential use cases while focusing on the initial Science Drivers.

Participation

Participants were originally required to submit a short white paper to attend the in-person

workshop, which was held in conjunction with the NSF Huge Data workshop. There were 51

papers submitted. The final lightning talks were selected from this initial pool. Talks were chosen

to represent a breadth of topics. There were 56 people registered for the original in-person

workshop, and we had planned for an estimated total of 85 people in attendance, including

registrants, the FABRIC leadership team, FABRIC science design drivers, NSF Program

Officer(s), and the FABRIC advisory committee.

Due to COVID-19, the team decided early to move to a virtual workshop for the safety of all

participants. With the move to an abbreviated, online format, registration was opened up. This

change of format resulted in 328 registered for Day 1 and 308 registered for Day 2. Day 1 had

11 Panelists* and 220 total users in attendance, with 170 maximum concurrent viewers**. On Day 2, we

had 14 Panelists* and 152 total users in attendance, with 130 maximum concurrent viewers**. An estimated

147 institutions across 15 countries were represented at the workshop, including: Brazil,

Canada, China, Germany, Ireland, Italy, Japan, Netherlands, New Zealand, Republic of Korea,

Slovakia, South Africa, Spain, United Kingdom and the US.

Participants interacted with the speakers by asking questions via Zoom and interacted with each

other and the FABRIC team via a community Slack channel.

*Panelists were members of the FABRIC leadership team and presenters.

**Maximum number of online viewers at the same time during the webinar, excluding Panelists.

Overarching Themes

Data: Although FABRIC funding is for construction of a testbed, data is crucial to every aspect

of its operation, highlighting the importance of addressing open questions about data.

Discussions revolved around four aspects: 1. the challenge of acquiring data for experiments,

the availability and re-use of it by other experimenters (leading to better reproducibility) as well

as its storage; 2. the type of data and metadata that would be made available from the testbed

infrastructure itself; 3. access to various types of “sensitive” data such as real-world

packet traces, data that reflects the networking characteristics of data coming from other

countries/regions, medical-type data and industry data sets; and 4. FABRIC as a vehicle for

experimentation on moving big data.

There were questions on the capabilities of the testbed for supporting features related to data

integrity and privacy, especially as experimenters include new types of data for testing

algorithms and test new types of IoT/Edge systems with novel data collection capabilities.

Access to sensitive data such as medical imaging for AI/ML experiments and to “normal”,

realistic, network traffic for security testing was brought up. Notions of what application data can

appropriately and safely be used on a testbed vary considerably, and more clarity is needed

from the project in working with community and data experts. Privacy techniques for the

collection and sharing of such data need to be identified and enabled. Despite the privacy

challenges, security experts believed FABRIC could be used as a platform to help inform the

community on data sharing practices for cyber-detection and attack-related data sets.

A topic of particular interest was the ability to generate traffic in FABRIC slices based on real

traffic traces acquired from providers or based on modeling realistic traffic. This will require

further input from the community for acquiring, sanitizing and curating traffic traces from

providers and building mechanisms that would allow replay or generation based on this data

inside the experiments, likely as part of tooling provided to FABRIC by respective experimenter

communities.
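
As one illustration of what sanitizing a trace before replay might involve, the sketch below replaces IPv4 addresses with stable pseudonyms derived from a keyed hash and drops packet payloads. It is a minimal example under stated assumptions and not a FABRIC tool: the record layout (`timestamp`, `src`, `dst`, `length`, `payload`) and the function names are hypothetical, and the mapping does not preserve network prefixes, which some replay use cases would require.

```python
import hashlib
import hmac
import ipaddress


def pseudonymize_ip(address: str, key: bytes) -> str:
    """Map an IPv4 address to a stable pseudonym using HMAC-SHA256.

    The same (address, key) pair always yields the same pseudonym, so flows
    stay consistent within a trace. Prefix structure is NOT preserved.
    """
    packed = ipaddress.IPv4Address(address).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    # Use the first 4 bytes of the digest as the replacement address.
    return str(ipaddress.IPv4Address(digest[:4]))


def sanitize_record(record: dict, key: bytes) -> dict:
    """Return a copy of a (hypothetical) trace record that keeps timing and
    size, pseudonymizes both addresses, and omits the payload entirely."""
    return {
        "timestamp": record["timestamp"],
        "src": pseudonymize_ip(record["src"], key),
        "dst": pseudonymize_ip(record["dst"], key),
        "length": record["length"],
    }


if __name__ == "__main__":
    secret = b"example-key-held-by-the-trace-provider"  # assumption: provider-held secret
    raw = {"timestamp": 0.001, "src": "192.0.2.1", "dst": "198.51.100.7",
           "length": 60, "payload": b"..."}
    print(sanitize_record(raw, secret))
```

Keeping the key with the trace provider means experimenters can correlate flows inside a sanitized trace without being able to recompute the original addresses by hashing candidate values themselves.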

Measurement: The need for “fine grained measurements capabilities” came up repeatedly, and

the varying interpretations of the term “measurement” came to light. For example, some want to

gather telemetry data at the optical layer from transponders to expose micro events that could

serve as a feedback loop for improving measurement tools. Others asked about the

measurement of the health of the FABRIC infrastructure itself: specifically, the collection of

server statistics such as cores, CPU and memory stats as well as network measurements such

as NetFlow statistics and “packet by packet monitoring where needed”. For IoT and real-time

sensors at the edge, measurements of jitter and latencies were seen as critical. Varying

opinions on public measurement/visibility into experiments were expressed.
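
To make the jitter and latency measurements mentioned for edge sensors concrete, the sketch below summarizes a series of one-way latency samples. The running jitter estimate follows the smoothing formula used for interarrival jitter in RFC 3550, J = J + (|D| - J)/16, where D is the difference between consecutive samples. The function names and the sample values are illustrative assumptions, not part of any FABRIC measurement tooling.

```python
from statistics import mean


def interarrival_jitter(latencies_ms):
    """Running jitter estimate in the style of RFC 3550.

    For each pair of consecutive latency samples, D is their difference and
    the estimate is updated as J += (|D| - J) / 16.
    """
    jitter = 0.0
    for previous, current in zip(latencies_ms, latencies_ms[1:]):
        jitter += (abs(current - previous) - jitter) / 16.0
    return jitter


def summarize_latency(latencies_ms):
    """Return mean latency, worst-case latency, and the jitter estimate."""
    return {
        "mean_ms": mean(latencies_ms),
        "max_ms": max(latencies_ms),
        "jitter_ms": interarrival_jitter(latencies_ms),
    }


if __name__ == "__main__":
    samples = [10.0, 12.5, 11.0, 30.0, 10.5]  # made-up latency samples in ms
    print(summarize_latency(samples))
```

A perfectly steady stream yields a jitter of zero regardless of its mean latency, which is why the two quantities are reported separately.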

Storage: Storage continues to be a challenge for both experimenters and cyberinfrastructure

providers. While FABRIC plans to have a fair amount of scratch disk and short-term storage,

open questions remain about how much infrastructure data and metadata to store, and how to

work with experimenters on obtaining access to data sets (if at all). One participant commented

that experiment reproducibility becomes a concern when the data volumes get large enough

that storage is not easy or possible, as well as when compute/storage becomes transient and/or

distributed. The need for a distributed resilient storage system (both FABRIC-related and in

general) came up, as questions arose about the amount of persistent, permanent storage

provided by FABRIC and the possibility of enabling some storage at a cloud or external provider.

Security and Privacy: Security and privacy feedback fell into two categories: 1.

concerns about FABRIC safely interacting with the real world since many experiments won’t

work in a closed network and need access to real traffic; and 2. security of the FABRIC

infrastructure and software, since security assurances are critical to support many use cases

(e.g., medical use cases). European participants surfaced the GDPR aspects of data, with

University of Amsterdam offering to share their document on guidelines around data that may

and may not enter their own testbed and how and where it’s stored.

Collaboration & Community: A myriad of projects expressed interest in working with FABRIC.

These ranged from recent NSF CC* and Mid-Scale awards (such as SAGE NSF#1935984), to

existing testbeds, to up-and-coming international efforts, as well as stand-alone projects. A few

asked the community to think about testbed co-development along with collaborative

experimentation. The recent NSF IRNC solicitation and its testbed area mentioning FABRIC was

discussed as a vehicle for such collaboration. A comment was made that the networking and

systems research community has typically conducted experiments on only one (or perhaps

two or three) testbeds and that experiments (software, scripts to run experiment trials, analysis

software, etc.) are often not portable. Changing this paradigm and being able to conduct

experiments across testbeds (i.e., FABRIC as a testbed of testbeds) was deemed important.

Engaging with other large NSF-funded projects, both in CISE (e.g., the software institutes,

centers of excellence, ...) as well as in the science directorates, especially the Large Facilities,

was encouraged, as was working with projects like SENSE, AutoGOLE, IRNC testbeds,

SEARCCH and a new global testbed under development by the Global Network Advancement

Group (GNA-G). Some referred to FABRIC as a “testbed of testbeds”.