  • garywalton05

Building a Comprehensive Workforce Analytics Dataset: Unveiling the Need and Key Elements

In pursuit of an impactful workforce analytics portfolio, I have explored publicly available datasets. Although my sample is small (3 of likely hundreds), I evaluated each dataset against essential criteria, looking for a foundation on which to showcase my skills and expertise. While these datasets showed potential for deriving insights in specific areas, the need remains for a comprehensive dataset that covers all aspects of workforce analytics. Below, I consolidate the findings from the dataset reviews and outline the key elements required to create a comprehensive dataset suitable for a suite of workforce analytics reports and analytical dashboards.

Dataset Reviews Recap:

  • Score: Data Volume (2/10), Data Completeness (3/10), Data Quality (5/10), Realism and Complexity (4/10), Data Availability (10/10)

  • Conclusion: This dataset lacks the diversity of workers and timeframes required for in-depth analysis. The absence of worker history and crucial HR dimensions restricts its usability.

  • Score: Data Volume (5/10), Data Completeness (3/10), Data Quality (9/10), Realism and Complexity (2/10), Data Availability (10/10)

  • Conclusion: While this dataset offers specific insights, it lacks worker history and essential HR dimensions, limiting its scope for workforce analytics.

  • Score: Data Volume (8/10), Data Completeness (5/10), Data Quality (7/10), Realism and Complexity (9/10), Data Availability (10/10)

  • Conclusion: This dataset provides valuable insights, but its limited scope inhibits answers to critical questions, emphasizing the need for a comprehensive dataset.

The Need for a Comprehensive Workforce Analytics Dataset:

From my dataset reviews, it is evident that a single publicly available dataset may not fulfil the diverse requirements of a comprehensive workforce analytics portfolio. To showcase the full range of my skills and insights, I therefore recognize the need to create my own dataset, thoughtfully curated to include the following key elements:

  1. Data Volume: A sizeable dataset with a substantial number of workers across a reasonable timeframe to enable thorough analysis of trends and patterns.

  2. Data Completeness: A dataset containing all necessary dimensions, including demographic and job data, alongside comprehensive fact and event tables, allowing for a holistic approach to workforce analytics.

  3. Data Quality: Data cleaning processes to ensure accuracy, consistency, and reliability, guaranteeing the validity of the insights derived.

  4. Realism and Complexity: Reflection of real-world scenarios and challenges, incorporating multiple business units, diverse geographical locations, and varying employee types for robust analysis.

  5. Data Availability: Ethically and legally sourced data, ensuring the dataset is suitable for use in my portfolio.

Creating A Comprehensive Workforce Analytics Dataset:

To create the dataset as envisioned, I will curate synthetic data that emulates real-world scenarios drawn from my experience with a multinational medical, technology, and manufacturing corporation. The dataset will encompass diverse worker profiles, including comprehensive worker histories, manager assignments, business structure, geographic representation, compensation details, and employee engagement and performance: essentially everything one would expect when using a platform such as SuccessFactors or Workday.
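As a rough illustration of the approach described above, here is a minimal Python sketch of how a synthetic worker dimension and a worker event table might be generated. Everything in it is an assumption made for illustration: the names `Worker`, `WorkerEvent`, and `generate_dataset`, the reference lists of business units, locations, and worker types, and the promotion and termination probabilities are all hypothetical, and none of them reflects how SuccessFactors or Workday structure their data. A real build would also need compensation, engagement, and performance tables.

```python
import random
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Tuple

# Hypothetical reference values, chosen only for illustration.
BUSINESS_UNITS = ["Medical", "Technology", "Manufacturing"]
LOCATIONS = ["US", "UK", "DE", "IN"]
WORKER_TYPES = ["Employee", "Contingent"]


@dataclass
class Worker:
    """One row of a simplified worker dimension."""
    worker_id: int
    worker_type: str
    business_unit: str
    location: str
    manager_id: Optional[int]  # None for the first worker (top of hierarchy)
    hire_date: date


@dataclass
class WorkerEvent:
    """One row of a simplified worker history (event) table."""
    worker_id: int
    event_type: str  # "Hire", "Promotion", or "Termination"
    effective_date: date


def generate_dataset(
    n_workers: int, start: date, end: date, seed: int = 0
) -> Tuple[List[Worker], List[WorkerEvent]]:
    """Generate workers and their events between start and end (start < end)."""
    rng = random.Random(seed)  # seeded so the same inputs give the same data
    span_days = (end - start).days
    workers: List[Worker] = []
    events: List[WorkerEvent] = []

    for worker_id in range(1, n_workers + 1):
        hire_date = start + timedelta(days=rng.randrange(span_days))
        # Pick the manager from lower IDs only, so the hierarchy has no cycles.
        manager_id = rng.randrange(1, worker_id) if worker_id > 1 else None
        workers.append(Worker(
            worker_id, rng.choice(WORKER_TYPES), rng.choice(BUSINESS_UNITS),
            rng.choice(LOCATIONS), manager_id, hire_date,
        ))
        events.append(WorkerEvent(worker_id, "Hire", hire_date))

        # Decide on termination first so a promotion can never follow it.
        active_days = (end - hire_date).days
        termination_date = None
        if active_days > 30 and rng.random() < 0.15:  # assumed attrition rate
            active_days = rng.randrange(30, active_days)
            termination_date = hire_date + timedelta(days=active_days)

        # Assumed rule: promotion only after at least a year of service.
        if active_days > 365 and rng.random() < 0.30:
            promo_offset = rng.randrange(365, active_days)
            events.append(WorkerEvent(
                worker_id, "Promotion", hire_date + timedelta(days=promo_offset)
            ))

        if termination_date is not None:
            events.append(WorkerEvent(worker_id, "Termination", termination_date))

    return workers, events


workers, events = generate_dataset(200, date(2020, 1, 1), date(2024, 1, 1), seed=42)
```

The sketch deliberately ignores several real-world details: a manager may be hired after a report or may have already left, and each worker stays in one business unit and location. Those simplifications are exactly the kind of gaps that the realism and complexity criterion above is meant to close in the full dataset.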


