3.2: Data Stewardship Model

✂️ Tl;dr 🥷

This section details a federated data stewardship model for managing the platform's geospatial data assets effectively. The model assigns responsibility for specific data domains to expert Data Stewards, fostering accountability and leveraging their specialised knowledge for better data governance. Key steward workflows include rigorous data validation and quality assurance, the promotion of valuable user-generated content to official enterprise status, comprehensive metadata management and systematic issue resolution. Stewards ensure data integrity and fitness for use throughout the data lifecycle. A suite of tools, including ArcGIS Pro, Portal for ArcGIS and an enterprise data catalogue, will support them in these tasks. This approach translates governance policies into practice, safeguarding data value and building a sustainable data management framework.

3.2.1. Federated Data Stewardship Approach

This architecture recommends implementing a federated data stewardship model. This approach assigns responsibility for specific datasets or data domains to designated Data Stewards who possess relevant subject matter expertise. Instead of a single, central authority managing all data, stewardship is distributed, empowering individuals or teams closer to the data's creation, use and business context.

Rationale: A federated model brings several benefits:

  • Domain Expertise: Stewards with deep knowledge of specific data domains are better equipped to make informed decisions about data quality, appropriate use and lifecycle management.
  • Accountability and Ownership: Assigning stewardship at the domain level fosters a stronger sense of ownership and accountability for the quality and integrity of those specific datasets.
  • Scalability: As the volume and complexity of geospatial data grow, a federated model can scale more effectively than a purely centralised one.
  • Responsiveness: Stewards can respond more quickly to data-related issues and requirements within their domain.
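
As a minimal sketch of what "federated" means in practice, stewardship can be pictured as a lookup from dataset to data domain to steward. The domain names, dataset names and steward names below are hypothetical placeholders, not assignments defined by this architecture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Steward:
    name: str
    domain: str


# Hypothetical domain-to-steward assignments (illustrative only).
STEWARDS = {
    "transport": Steward("A. Example", "transport"),
    "utilities": Steward("B. Example", "utilities"),
}

# Hypothetical mapping of datasets to data domains (illustrative only).
DATASET_DOMAINS = {
    "road_centrelines": "transport",
    "water_mains": "utilities",
}


def steward_for(dataset: str) -> Optional[Steward]:
    """Return the steward responsible for a dataset's domain, or None if unassigned."""
    domain = DATASET_DOMAINS.get(dataset)
    if domain is None:
        return None
    return STEWARDS.get(domain)
```

A dataset with no domain assignment returns `None`, which is one simple way to surface data that has no accountable steward yet.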

3.2.2. Data Steward Workflows and Processes

Data Stewards are responsible for executing and overseeing several critical workflows to ensure the effective governance of their assigned datasets. These workflows are designed to be systematic and repeatable, leveraging automation and defined procedures where possible.

3.2.2.1. Data Validation and Quality Assurance

A primary responsibility for Data Stewards is to ensure that datasets meet established quality standards. This involves:

  • Defining Validation Rules: In collaboration with Data Owners and technical teams, Data Stewards help define specific validation rules for their datasets. These rules cover:
    • Schema Conformance: Ensuring data adheres to the defined table structures, field types and relationships, particularly for data within the user-managed Enterprise Geodatabase.
    • Attribute Domain Validation: Verifying that attribute values fall within permissible ranges or coded value domains.
    • Spatial Integrity: Checking for valid geometries, correct spatial references and, where applicable, topological consistency. PostGIS functions can support automating these checks.
    • Business Rule Verification: Ensuring data conforms to specific organisational or operational business rules.
  • Overseeing Validation Processes: Validation checks should be automated where possible, but Data Stewards remain responsible for overseeing these processes.
  • Reviewing Validation Outputs: Stewards review the output of automated validation tools, or perform manual checks, to identify data quality issues, anomalies, or non-conformities.
  • Initiating Corrective Actions: Upon identifying quality issues, Stewards are responsible for initiating and tracking corrective actions, which may involve coordinating with data creators, GIS Engineers, or Database Administrators.

This proactive approach to data validation helps prevent the propagation of errors and ensures that decision-making is based on reliable, high-quality geospatial information.
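
The rule categories above can be sketched as a small set of checks over a single record. This is an illustration under stated assumptions rather than the platform's validation tooling: the `road_segments` fields, the coded values, the speed range and the business rule are all hypothetical, and the ring check covers only two basic conditions of geometry validity.

```python
from __future__ import annotations

# Hypothetical rules for an illustrative "road_segments" dataset.
EXPECTED_SCHEMA = {"segment_id": int, "surface": str, "speed_limit": int}
CODED_DOMAINS = {"surface": {"sealed", "unsealed"}}
VALUE_RANGES = {"speed_limit": (10, 110)}


def validate_record(record: dict) -> list[str]:
    """Return human-readable issues; an empty list means the record passed."""
    issues = []
    # Schema conformance: each expected field is present with the expected type.
    for name, expected_type in EXPECTED_SCHEMA.items():
        if name not in record:
            issues.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            issues.append(f"wrong type for {name}")
    # Attribute domain validation: coded values, then numeric ranges.
    for name, allowed in CODED_DOMAINS.items():
        if name in record and record[name] not in allowed:
            issues.append(f"value outside coded domain for {name}: {record[name]!r}")
    for name, (low, high) in VALUE_RANGES.items():
        value = record.get(name)
        if isinstance(value, int) and not low <= value <= high:
            issues.append(f"value out of range for {name}: {value}")
    # Business rule (hypothetical): unsealed segments may not exceed 80.
    speed = record.get("speed_limit")
    if record.get("surface") == "unsealed" and isinstance(speed, int) and speed > 80:
        issues.append("business rule: unsealed segments must have speed_limit <= 80")
    return issues


def ring_issues(ring: list[tuple[float, float]]) -> list[str]:
    """Basic spatial integrity checks for a polygon ring given as (x, y) vertices."""
    issues = []
    if len(ring) < 4:
        issues.append("ring has fewer than four vertices")
    if ring and ring[0] != ring[-1]:
        issues.append("ring is not closed")
    return issues
```

Returning a list of issues rather than a pass/fail flag gives the Steward something concrete to review and to attach to a corrective action. For geometries held in the enterprise geodatabase, a PostGIS function such as `ST_IsValid` would be a more thorough check than the simplified ring test shown here.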

3.2.2.2. Promotion of User-Generated Content to Enterprise Status

A key workflow within this recommended model is the process of promoting valuable user-generated or temporary datasets to official "Enterprise Data" status. This typically involves migrating data from the Esri-managed ArcGIS Data Store to the user-managed Enterprise Geodatabase.

The Data Steward plays a central role in this promotion workflow:

  1. Nomination and Initial Assessment:

    • Datasets may be nominated for promotion by users or Data Owners, or identified by the Data Steward based on observed usage, perceived value, or strategic importance.
    • The Steward conducts an initial assessment to determine if the dataset warrants consideration for enterprise status, based on criteria such as data uniqueness, potential for broader organisational use and alignment with business objectives.
  2. Quality and Conformance Validation:

    • The nominated dataset undergoes validation against enterprise data quality standards and schema requirements defined for the target environment.
  3. Transformation and Migration:

    • If the dataset meets quality and structural requirements, the Steward coordinates its migration. This may involve:
      • Data cleansing or transformation to align with enterprise data models.
      • Utilisation of ETL (Extract, Transform, Load) tools or scripts managed by Data Engineers or GIS Engineers.
      • Ensuring complete and accurate transfer to the designated environment.
  4. Approval and Registration:

    • The Data Steward, in consultation with the Data Owner, formally approves the dataset's promotion to enterprise status.
    • The Steward ensures the newly promoted dataset is:
      • Documented with required metadata.
      • Registered in the data catalogue.
      • Accessible through appropriately configured and secured ArcGIS services published from the Azure Database for PostgreSQL.

The following diagram illustrates this promotion workflow:

```mermaid
flowchart TB
    subgraph UserGeneratedSpace ["📝 User-Generated Domain (e.g., ArcGIS Data Store)"]
        direction LR
        UGD["📄 User-Generated Dataset"]
    end

    subgraph StewardReviewProcess ["🛡️ Data Steward Review & Promotion Process"]
        direction TB
        STEP1["Step 1: 📥 Nomination & Initial Assessment"]
        STEP2["Step 2: ✅ Quality & Conformance Validation"]
        STEP3["Step 3: ⚙️ Transformation & Migration (if needed)"]
        STEP4["Step 4: 🔑 Approval & Registration"]
    end

    subgraph EnterpriseSpace ["🌟 Enterprise Domain (e.g., Azure PostgreSQL)"]
        direction LR
        ED["💾 Enterprise Dataset"]
    end

    UGD --> STEP1
    STEP1 --> STEP2
    STEP2 -- Meets Criteria --> STEP3
    STEP2 -- Fails Criteria --> RFTU["↩️ Return to User / Revise"]
    RFTU -.-> UGD
    STEP3 --> STEP4
    STEP4 --> ED

    classDef user_domain fill:#e3f2fd,stroke:#333,stroke-width:1px;
    classDef steward_process fill:#fff9c4,stroke:#333,stroke-width:1px;
    classDef enterprise_domain fill:#e8f5e9,stroke:#333,stroke-width:1px;

    class UGD,RFTU user_domain;
    class STEP1,STEP2,STEP3,STEP4 steward_process;
    class ED enterprise_domain;
```
Diagram: Workflow for promoting user-generated content to enterprise status, overseen by Data Stewards.
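
The promotion workflow can also be read as a small state machine. The sketch below mirrors the transitions in the diagram, including the single "Fails Criteria" branch out of validation. The `Stage` names and the `advance` function are hypothetical and exist only for illustration.

```python
from enum import Enum


class Stage(Enum):
    USER_GENERATED = 0  # dataset in the user-generated domain
    ASSESSMENT = 1      # Step 1: nomination and initial assessment
    VALIDATION = 2      # Step 2: quality and conformance validation
    MIGRATION = 3       # Step 3: transformation and migration
    APPROVAL = 4        # Step 4: approval and registration
    ENTERPRISE = 5      # dataset in the enterprise domain


def advance(stage: Stage, validation_passed: bool = True) -> Stage:
    """Return the stage that follows `stage` in the promotion workflow."""
    if stage is Stage.ENTERPRISE:
        raise ValueError("dataset already has enterprise status")
    if stage is Stage.VALIDATION and not validation_passed:
        # 'Fails Criteria' branch: the dataset goes back to the user for revision.
        return Stage.USER_GENERATED
    return Stage(stage.value + 1)
```

Calling `advance` repeatedly from `Stage.USER_GENERATED` walks a dataset through all four steps, while `advance(Stage.VALIDATION, validation_passed=False)` returns it to the user-generated domain, as the dotted arrow in the diagram does.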

3.2.2.3. Metadata Management

Accurate metadata is crucial for data discovery, understanding, appropriate use and governance. Data Stewards are responsible for overseeing metadata management for datasets within their domain. This includes:

  • Ensuring Metadata Creation: Verifying that new enterprise datasets are accompanied by complete metadata, adhering to organisational standards (e.g., based on ISO 19115 or a defined internal profile).
  • Maintaining Metadata Accuracy: Reviewing and updating metadata to reflect any changes to dataset structure, content, or lineage.
  • Promoting Metadata Use: Encouraging data consumers to utilise metadata to understand dataset characteristics, limitations and fitness for use.
  • Data Lineage Documentation: Ensuring that the provenance of key datasets (their origin, transformations and dependencies) is captured within the metadata or associated documentation.

Tools such as the Portal for ArcGIS metadata editor and enterprise data cataloguing solutions can support Stewards in these activities.
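
A metadata completeness check is one part of this oversight that lends itself to automation. The sketch below assumes a hypothetical internal profile of required fields. The field names are illustrative and are not ISO 19115 element names.

```python
from __future__ import annotations

# Hypothetical required fields for an internal metadata profile (illustrative only).
REQUIRED_FIELDS = ("title", "abstract", "lineage", "contact", "spatial_reference")


def missing_metadata(metadata: dict) -> list[str]:
    """Return the required fields that are absent or blank in a metadata record."""
    return [
        name
        for name in REQUIRED_FIELDS
        if not str(metadata.get(name) or "").strip()
    ]
```

Treating blank strings as missing catches records where a field was created but never filled in, which a simple key-presence check would pass.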

3.2.2.4. Issue Resolution

Data Stewards act as the primary point of contact and coordination for resolving data quality issues within their assigned domains. The issue resolution workflow typically involves:

  1. Issue Identification/Reporting: Quality issues may be identified through automated validation checks, reported by data users, or discovered by the Steward during routine reviews.
  2. Investigation and Root Cause Analysis: The Steward investigates the reported issue to understand its scope, impact and underlying cause.
  3. Coordination of Corrective Actions: Depending on the nature of the issue, the Steward collaborates with:
    • Data Creators/Maintainers: To correct errors at the source.
    • GIS Engineers/Data Engineers: To fix issues in ETL processes, database configurations, or service definitions.
    • Data Custodians: If the issue relates to infrastructure or system-level problems.
  4. Tracking and Verification: The Steward tracks the progress of corrective actions and verifies that the issue has been resolved effectively.
  5. Communication: The Steward communicates the resolution and any associated impacts to relevant stakeholders, including the original reporter and affected data users.
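
The five steps above can be sketched as an issue record with a restricted set of status transitions. The status names and the `DataIssue` class are hypothetical. In practice the issue tracker would hold this state, and the rework loop from verification back to correction is an assumption rather than something this model prescribes.

```python
from dataclasses import dataclass, field

# Allowed status transitions, following the five workflow steps.
ALLOWED = {
    "reported": {"investigating"},
    "investigating": {"correcting"},
    "correcting": {"verifying"},
    "verifying": {"resolved", "correcting"},  # assumed: failed verification reopens correction
    "resolved": {"communicated"},
    "communicated": set(),
}


@dataclass
class DataIssue:
    dataset: str
    summary: str
    status: str = "reported"
    history: list = field(default_factory=list)

    def move_to(self, new_status: str) -> None:
        """Move to a new status, rejecting transitions that skip a workflow step."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"cannot move from {self.status!r} to {new_status!r}")
        self.history.append(self.status)
        self.status = new_status
```

Rejecting skipped steps means, for example, that an issue cannot be marked resolved without first passing through verification.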

3.2.3. Tools and Technologies Supporting Data Stewards

To effectively execute their responsibilities, Data Stewards will leverage a combination of tools and technologies. These may include:

  • ArcGIS Pro: For detailed data inspection, advanced spatial analysis, data validation and preparation of data for enterprise use.
  • Portal for ArcGIS: For managing item metadata, controlling sharing and access permissions, reviewing data in web maps and interacting with hosted layers.
  • VertiGIS Studio Workflow: For designing and executing guided workflows that can automate parts of data validation, submission and quality control processes, ensuring adherence to governance rules.
  • Beekeeper Studio or DataGrip: For direct, read-only inspection of data within the user-managed Azure Database for PostgreSQL, if necessary for deep analysis.
  • Jira: For formally logging, managing and tracking data quality issues and enhancement requests.
  • Enterprise Data Catalogue: For registering, discovering and understanding available enterprise datasets and their lineage.

The effective implementation of this Data Stewardship Model, underpinned by clear roles, defined workflows and appropriate tooling, is essential for maintaining the quality, integrity and value of the platform's geospatial data. This model empowers Data Stewards to be proactive champions of data governance within their respective domains. This structured approach moves beyond ad-hoc data management, establishing a clear and sustainable framework for all stakeholders.