1. Problem or Opportunity Statement: The government spends millions of dollars and hires thousands of individuals to analyze data on behalf of the American people. Yet as technology changes and new tools become available each year to analyze data more quickly and cost-effectively, agencies are reluctant to adopt them. We aim to create an informational resource cataloging the analytical tools that could be used by various groups in the Federal Government. For example, a procurement officer would find a central repository of data analytic tools a valuable resource during market research, when evaluating the breadth of tools available for conducting similar types of data analysis or displaying similar types of information. Data analysts encounter limits in their current tools (e.g., an Excel worksheet is capped at 1,048,576 rows) and could use the central repository to learn what additional tools are available to them. A section chief or the head of an agency might be unfamiliar with the new open-source software preferred by newly minted data scientists, or uncertain about the risks that open-source software poses to their agency. These policymakers may benefit from knowing that many other agencies in the Federal Government are already using the same tool.
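To make the row-limit point concrete, the sketch below (Python with the pandas library; the file name and chunk size are hypothetical placeholders, not part of this proposal) illustrates how an open-source tool can process a dataset larger than Excel's worksheet cap by streaming it in pieces:

```python
# Minimal sketch, not part of the approved plan: pandas can stream a CSV
# larger than Excel's 1,048,576-row worksheet limit by reading it in chunks.
# "survey_data.csv" is a hypothetical placeholder file name.
import pandas as pd

EXCEL_ROW_LIMIT = 1_048_576  # maximum rows in one Excel worksheet

total_rows = 0
# Read 100,000 rows at a time so memory use stays bounded.
for chunk in pd.read_csv("survey_data.csv", chunksize=100_000):
    total_rows += len(chunk)

if total_rows > EXCEL_ROW_LIMIT:
    print(f"{total_rows:,} rows exceed Excel's limit of {EXCEL_ROW_LIMIT:,}.")
```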
2. Identify the root cause: In two initial focus groups on the software programs used to analyze data at different Federal agencies, our interviews suggested a growing consensus among data scientists around open-source software (e.g., Python and R) for analyzing large datasets. We also found preliminary evidence that adoption of these platforms in the Federal Government has been slow for three reasons: (1) unfamiliarity with new tools, (2) uncertainty about the risks, and (3) the proficiency of current staff on legacy systems. We need to further understand the issues around data analysis and the adoption of new technologies in the Federal Government. We therefore propose to create an Interagency Survey of Data Analytical Tools (ISDAT) to better understand what is currently in use and what software data analysts would like to learn in the future. The data from the ISDAT will be organized into a central repository. We believe making this information available to everyone in the Federal Government will have a positive impact.
3. Benchmarking and Market Research Plan: We propose first sending the survey to a small group of individuals to determine whether the questions are understandable and elicit useful information. After making any necessary adjustments to the survey instrument, we will send the survey to everyone who might conduct data analysis in each of the five agencies in the ISDAT pilot.
4. Goals: To develop the ISDAT survey and provide the information from the survey in a searchable database of the best tools for analyzing data among five different government agencies by August 2020. We have three goals with regard to this project: (1) develop and pilot the ISDAT survey instrument, (2) administer the survey to data analysts at the five pilot agencies, and (3) organize the responses into a searchable database.
5. Project Description: We propose to develop a pilot project for the ISDAT to identify the tools being used to analyze data at five agencies: the Department of State, the Department of Health and Human Services, the Department of Education, the US Patent and Trademark Office, and the National Credit Union Administration. The pilot project could be expanded by future EIG Results Groups to include other agencies. We would gather the information collected from our surveys of the five agencies into a searchable database, sketched below, and we propose that the content of our project be housed on the website of the Partnership for Public Service.
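As an illustration only, the sketch below shows one way the searchable database might be structured, assuming a simple SQLite backend in Python; the table layout, column names, and example record are hypothetical, not an approved design:

```python
# Minimal sketch, assuming a file-based SQLite store; schema and sample
# data are illustrative placeholders, not actual ISDAT results.
import sqlite3

conn = sqlite3.connect("isdat_pilot.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS tools (
        tool_name     TEXT,    -- e.g., "R", "Python", "Excel", "SAS"
        agency        TEXT,    -- one of the five pilot agencies
        analysis_type TEXT,    -- e.g., "statistical", "financial"
        open_source   INTEGER  -- 1 if open source, 0 otherwise
    )
""")
conn.execute(
    "INSERT INTO tools VALUES (?, ?, ?, ?)",
    ("R", "Department of Education", "statistical", 1),
)
conn.commit()

# Example search: which open-source tools are in use for statistical work?
rows = conn.execute(
    "SELECT tool_name, agency FROM tools "
    "WHERE open_source = 1 AND analysis_type = 'statistical'"
).fetchall()
for tool_name, agency in rows:
    print(tool_name, "-", agency)
```

A lightweight file-based store of this kind would let stakeholders filter the pilot results by agency, tool, or type of analysis without requiring a dedicated server.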
6. Impact and Results: We aim to make it easier for our stakeholders to understand what tools are available to them for analyzing data. The ultimate success of what we are proposing can only be measured after our EIG program concludes. Ideally, we would also create a roadmap or template for future cohorts to build on, or for other agencies to begin using. In the short term, success will be measured by our group's ability to create the survey instrument, the number of analysts who respond to our survey, the total number of agencies that respond, a tally of the types of data they analyze (statistical, financial, economic), and how well we organize the information collected from the survey. Depending on the number of responses, we will explore the data through visualization (creating graphs) to better understand the results and the trends we are seeing.
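To show the kind of graph we have in mind, the sketch below draws a simple bar chart with Python's matplotlib library; the tool names and response counts are hypothetical placeholders, not actual survey results:

```python
# Minimal sketch of the kind of graph we would produce; the counts below
# are hypothetical placeholders, not actual ISDAT survey results.
import matplotlib.pyplot as plt

tools = ["Excel", "SAS", "R", "Python"]
respondents = [120, 45, 60, 55]  # hypothetical response counts

plt.bar(tools, respondents)
plt.xlabel("Analytical tool")
plt.ylabel("Number of respondents reporting use")
plt.title("Hypothetical ISDAT pilot results")
plt.tight_layout()
plt.show()
```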
7. Stakeholder Engagement:
8. Risks and Constraints: One risk is that not enough people will respond to the survey; a high nonresponse rate would impede our ability to achieve our project goal. We could also have sampling issues, in that the people we survey may not accurately represent the audience we are trying to reach. We are also assuming we have a clear path to house the data; at this stage in the project planning, it is unclear whether the data will be made available for wider use after we have gathered it. Finally, we will need to build a mechanism for updating and expanding the platform in the future.