Using a CSV file as a Data Source in Gatsby

You can view the current build of the project on Netlify via https://gatsby-csv.netlify.com/

The final source code for this project is available in its public GitHub repository.

You can also fork and modify the project on CodeSandbox at its public box.

Introduction

Recently I worked on a client project which required the use of a CSV file to populate a mapping tool with data. This CSV file acted as if it were a database, giving context to locations.

After exploring options for converting the CSV to JSON using a Node.js Express server, I found that exposing 60,000+ rows of data as JSON endpoints produced huge responses which, when saved as a file, totalled 250MB in size.

Large endpoints, on top of the problems of hosting a resource-heavy application, weren’t an ideal solution, so the search for alternatives began again.

Which leads me to the content of this blog post.

As the main mapping tool was created using Gatsby, I started to explore how I could convert the file to JSON at build-time instead of run-time.

Fortunately, I found a Gatsby plugin which allowed me to drop a CSV file into my Gatsby data directory and build a GraphQL schema based on that file during build-time.

While the solution I detail in this blog post wasn’t the final implementation in the mapping tool, it did serve as a lesson in how adaptable Gatsby is and what amazing plugins are on offer to developers using the framework.

The Challenge

To explore the Gatsby CSV data plugin, I’ve set the challenge to create a Gatsby site which converts a CSV file to a GraphQL schema which can be used in page components to list the nodes found in the schema (e.g. list of people).

We will also set the challenge of creating individual pages for each node found in this new schema using the Gatsby node file.

Solution

The final solution is a Gatsby website generated and hosted for free on Netlify available at https://gatsby-csv.netlify.com. All of the source code is managed on CodeSandbox but could be pulled down from its GitHub repository if you wanted to clone the example in a local environment.

Solving The Challenge

Getting Some Data

To explore the Gatsby CSV data plugin, we first need a CSV file to experiment with.

I did a search on GitHub for a random .csv file and came across this ‘people-example.csv’ found in a development course.

Note: The columns in the file were inconsistent so I've formatted them consistently to avoid confusion when querying later.

The CSV file is small, with five records of information on people. Each record contains the following fields:

  • First name
  • Last name
  • Age
  • Country of Residence

Gatsby Starter Kit

To get started on the site, I first forked a basic Gatsby starter kit which included some basic plugin configuration for data sources, web manifest files and Gatsby image transformation.

After forking the starter, I stripped out all of the extras that weren’t required for this example, to avoid confusing developers who inspect the project’s source code.

Including the Plugin

Next, I needed to make the CSV contents available to Gatsby as a GraphQL schema. This was done with the ‘gatsby-transformer-csv’ plugin.

The CSV transformer plugin requires the Gatsby project to already be using the ‘gatsby-source-filesystem’ plugin. The source filesystem plugin makes specified directories available to Gatsby during the build process, and the transformer then turns the rows of any CSV file found there into nodes.
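To make the transformer's role more concrete, here is a minimal sketch of the kind of conversion it performs: each CSV row becomes an object keyed by the header row. This is not the plugin's actual implementation. `csvToRows` is a hypothetical helper, the sample rows are invented for illustration, and the parsing is deliberately naive (no quoted fields or embedded commas).

```javascript
// Naive CSV-to-objects conversion, for illustration only.
// Assumes a header row, comma separators and no quoted fields.
function csvToRows(csvText) {
  const [headerLine, ...lines] = csvText.trim().split(/\r?\n/)
  const headers = headerLine.split(",").map(header => header.trim())

  return lines
    .filter(line => line.trim() !== "")
    .map(line => {
      const cells = line.split(",").map(cell => cell.trim())
      // Pair each header with the cell in the same column.
      return Object.fromEntries(
        headers.map((header, index) => [header, cells[index] ?? ""])
      )
    })
}

// Hypothetical sample data using the column names queried later in this post.
const sample = `FirstName,LastName,Age,Country
Ada,Example,36,UK
Sam,Sample,41,USA`

console.log(csvToRows(sample))
```

Every value stays a string in this sketch, so a column such as Age would need converting before any numeric comparison.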

Once both plugins have been installed from the command line with npm i gatsby-source-filesystem gatsby-transformer-csv, add the following configuration to your gatsby-config.js file.

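module.exports = {
  plugins: [
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `images`,
        path: `${__dirname}/src/images`,
      },
    },
    {
      // Exposes src/data (where the CSV file lives) to Gatsby at build time
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `data`,
        path: `${__dirname}/src/data/`,
      },
    },
    // Turns the rows of each sourced CSV file into GraphQL nodes
    `gatsby-transformer-csv`,
  ],
}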

Our New GraphQL Schema

With this configuration, when you run gatsby develop you will now have a new schema available to you in your GraphQL explorer (commonly found at http://localhost:8000/___graphql).

The name of the new node type is based on the filename of your CSV file. Because my file is named people.csv, its rows are available as nodes of type PeopleCsv in my GraphQL schema.

To access all of the nodes created by the CSV transformer plugin, I query the allPeopleCsv field and select the fields I need from its nodes.
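As a rough illustration of that naming convention, the sketch below derives the type name and the matching query fields from a filename. `typeNameFor` and `rootFieldsFor` are hypothetical helpers written for this post, not part of Gatsby or the plugin, and they only approximate the behaviour for simple lowercase filenames.

```javascript
// Approximates the convention described above: people.csv -> PeopleCsv.
// Assumption: simple lowercase filenames; the plugin's own casing rules may differ.
function typeNameFor(fileName) {
  const base = fileName.replace(/\.[^.]+$/, "") // drop the extension
  const words = `${base} csv`.split(/[^A-Za-z0-9]+/).filter(Boolean)
  return words
    .map(word => word[0].toUpperCase() + word.slice(1).toLowerCase())
    .join("")
}

// Gatsby exposes a single-node field and an "all" field for each node type.
function rootFieldsFor(typeName) {
  return {
    single: typeName[0].toLowerCase() + typeName.slice(1),
    all: `all${typeName}`,
  }
}

console.log(typeNameFor("people.csv"))
console.log(rootFieldsFor("PeopleCsv"))
```

If the names in your GraphQL explorer differ from what you expect, the explorer is the source of truth, since it reflects the schema Gatsby actually built.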

Querying Our New CSV GraphQL Schema

We can use this new schema on our front-end homepage to list out all of the people we find in our CSV file with the use of Gatsby’s StaticQuery component.

This example would be best used with pagination but as we only have five records, we are mapping all records found.

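import React from "react"
import { graphql, Link, StaticQuery } from "gatsby"
// These two paths assume the default starter layout; adjust them to your project
import Layout from "../components/layout"
import SEO from "../components/seo"

// Runs the query at build time and hands the result to IndexPage
export default () => (
  <StaticQuery
    query={graphql`
      query AllPeople {
        allPeopleCsv {
          nodes {
            Age
            Country
            FirstName
            LastName
          }
        }
      }
    `}
    render={data => <IndexPage data={data} />}
  />
)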

Now our IndexPage component has access to the nodes data returned by the GraphQL query. We can now map over the array of nodes found and link through to pages (if available) which we will go into next.

const IndexPage = ({ data }) => (
  <Layout>
    <SEO title="Home" />
    <h1>Using a CSV as a data source in Gatsby</h1>
    <p>These people were found in the CSV file.</p>
    <ul>
      {data.allPeopleCsv.nodes.length > 0 &&
        data.allPeopleCsv.nodes.map(person => (
          // React needs a unique key for each item in a list
          <li key={`${person.FirstName}-${person.LastName}`}>
            <Link to={`/${person.FirstName}-${person.LastName}`}>
              {person.FirstName}
            </Link>
          </li>
        ))}
    </ul>
  </Layout>
)

By mapping over the data which is available from our StaticQuery component, we are outputting a list of the people found in our CSV file.

You can also see that each person is wrapped in a Link component, which is imported from the gatsby package. This will allow us to link through to the individual pages once we’ve generated them with our gatsby-node.js file.
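For a larger CSV file, the pagination mentioned earlier could start from something like the sketch below, which splits the array of nodes into fixed-size pages. `paginate` is a hypothetical helper, not a Gatsby API, and the page size of two is arbitrary.

```javascript
// Splits a list of nodes into pages of at most perPage items each.
function paginate(items, perPage) {
  if (!Number.isInteger(perPage) || perPage < 1) {
    throw new RangeError("perPage must be a positive integer")
  }

  const pages = []
  for (let start = 0; start < items.length; start += perPage) {
    pages.push(items.slice(start, start + perPage))
  }
  return pages
}

// Placeholder values standing in for the five records in the CSV file.
const people = ["a", "b", "c", "d", "e"]
console.log(paginate(people, 2))
```

Each inner array could then be rendered as its own list page, with the last page holding whatever is left over.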

Creating Pages from CSV Data Programmatically

To create individual pages for each person found in our CSV file we will first need to create a gatsby-node.js file in our project root directory.

Within this file, we will want to call upon the createPages API that Gatsby provides, create a GraphQL query on our new schema and then run the createPage function on each node we find in our results (if we get any).

Read more about programmatically creating pages using the Gatsby API from their official post.

const path = require(`path`)

exports.createPages = ({ graphql, actions }) => {
  const { createPage } = actions
  // Template used to render every person page
  const peopleTemplate = path.resolve(`src/components/templates/people.jsx`)

  return graphql(
    `
      query AllPeople {
        allPeopleCsv {
          nodes {
            Age
            Country
            FirstName
            LastName
          }
        }
      }
    `
  ).then(result => {
    if (result.errors) {
      throw result.errors
    }

    // Create one page per person found in the CSV file.
    result.data.allPeopleCsv.nodes.forEach(person => {
      // Gatsby page paths should start with a forward slash
      const slug = `/${person.FirstName}-${person.LastName}`
      createPage({
        path: slug,
        component: peopleTemplate,
        // Every CSV column is passed to the template as pageContext
        context: {
          ...person,
        },
      })
    })
  })
}
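Names taken straight from a CSV file can contain spaces, capitals or punctuation, which make for awkward URLs. One possible refinement, sketched here under that assumption, is to build the slug and the createPage arguments in small helper functions. `toSlug` and `pageArgsFor` are hypothetical names and are not used in the project's source.

```javascript
// Builds a lowercase, URL-friendly path from a person's name.
function toSlug(person) {
  return `/${person.FirstName}-${person.LastName}`
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9/]+/g, "-") // collapse anything unusual into a hyphen
    .replace(/-+$/, "") // drop trailing hyphens
}

// Returns the object that would be handed to Gatsby's createPage action.
function pageArgsFor(person, template) {
  return {
    path: toSlug(person),
    component: template,
    context: { ...person },
  }
}

// Hypothetical record, invented for illustration.
const examplePerson = { FirstName: "Mary Ann", LastName: "O'Neil", Age: "30", Country: "UK" }
console.log(pageArgsFor(examplePerson, "src/components/templates/people.jsx"))
```

If you adopt something like this, the Link paths on the index page need to be built with the same helper, otherwise the links and the generated pages will no longer match.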

Improvements

To help better organise the codebase, I’ve created a GraphQL fragment which is used on the index page to query all fields specified in the fragment.

src/graphql-fragments/PeopleCSV.jsx

import { graphql } from "gatsby"
export const PeopleCsvFragment = graphql`
  fragment PeopleCsvFragment on PeopleCsv {
    Age
    Country
    FirstName
    LastName
  }
`

src/pages/index.js

query AllPeople {
  allPeopleCsv {
    nodes {
      ...PeopleCsvFragment
    }
  }
}

You can read more about GraphQL fragments and how to include them over in my other blog post. They are great for keeping your projects clean and remove the need for long files.

Conclusion

It’s great to see what’s possible with Gatsby and the support developers are providing as new third-party plugins become available in the framework’s ecosystem.

If you’re looking to build your own source plugin, do a quick search before you get started. You might save yourself a tonne of time or get a head start from other developers’ efforts!

Project Links

Again, you can view the current build of the project on Netlify via https://gatsby-csv.netlify.com/

The final source code for this project is available in its public GitHub repository.

You can also fork and modify the project on CodeSandbox at its public box.

Want More?

Interested in GraphQL, Gatsby or React? Follow me on Twitter @whatjackhasmade and reach out to discuss more on the technologies, or follow me for news related to the topics.