Open Source Software – How to Choose Data Analysis Software – Research Guides at University of Oregon Libraries

OpenRefine (previously known as Google Refine) is an open source pre-analysis tool built for cleaning and transforming messy data. Typical users include researchers in the social sciences and humanities, as well as for-profit and nonprofit organizations. Free download available.

Pros

  • Useful when working with messy data that needs to be cleaned or transformed before use
  • Connects to database APIs
  • Dataset merge capabilities
  • Unlimited undos and redos
  • The actions applied to a dataset can be extracted and later applied to additional datasets
  • Has a web-based interface

Cons

  • Uses its own expression language, GREL, which is less widely known and not always intuitive for beginners
  • Is not convenient for data entry; datasets should be uploaded into the program
  • Is less suited to large datasets; single columns cannot be collapsed for better visibility
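
To make the point about extracting actions more concrete: OpenRefine can export the steps applied to a dataset as JSON, and that JSON can be inspected or reused elsewhere. The sketch below is only an illustration in Python. The sample history, the column name "city", and the helper summarize_operations are hypothetical, and the field names ("op", "description", "expression") are an assumption about what a typical export looks like, so check them against your own extracted history. The "expression" field shows what a short GREL expression can look like.

```python
import json
from typing import List

# Hypothetical sample of an extracted operation history. The field names
# are an assumption about the exported JSON; verify against a real export.
SAMPLE_HISTORY = """
[
  {
    "op": "core/text-transform",
    "columnName": "city",
    "expression": "grel:value.trim().toTitlecase()",
    "description": "Text transform on cells in column city"
  },
  {
    "op": "core/column-rename",
    "oldColumnName": "city",
    "newColumnName": "City",
    "description": "Rename column city to City"
  }
]
"""


def summarize_operations(history_json: str) -> List[str]:
    """Return one numbered, human-readable line per recorded operation."""
    operations = json.loads(history_json)
    summary = []
    for step, operation in enumerate(operations, start=1):
        # Fall back to placeholders if a field is missing from the export.
        op_name = operation.get("op", "unknown")
        description = operation.get("description", "")
        summary.append(f"{step}. {op_name}: {description}")
    return summary


if __name__ == "__main__":
    for line in summarize_operations(SAMPLE_HISTORY):
        print(line)
```

Reviewing a summary like this before reapplying the steps to another dataset can help confirm that the column names in the history match the new file.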

 

Import and Export File Capabilities

Import: Excel files (.xls, .xlsx), text files (.csv, .tsv), web-based files (.xml, .html, .rdf), and additional formats (.json, .tar, .tar.gz, Google Spreadsheets, Google Fusion Tables)

Export: Excel files (.xls, .xlsx), text files (.csv, .tsv), web-based files (.html), and additional formats (.json, .tar, .tar.gz)