
How do different users use and perceive NYC parks?


To date, public space data collection still largely follows traditional methods such as observational studies and questionnaires. The emergence of social media, however, has changed the way scholars understand public space and people's behavior. In this project, we explored the possibility of using user comments as a measure of how people perceive urban parks.

User comments were collected from two of the most popular traveller community websites: TripAdvisor and Google Maps. Compared to traditional data-gathering methods like interviews, observational studies, and questionnaires, the proposed method is more cost-effective, covers a wider time range, yields a larger sample size, and is more accessible to researchers.
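The collection step can be sketched as parsing review markup into structured (rating, text) records. The snippet below is a minimal illustration using Python's standard-library `HTMLParser` on a made-up page fragment; the class names (`review`, `rating`, `text`) are hypothetical, and real TripAdvisor and Google Maps pages are JavaScript-rendered, so in practice an official API or a headless browser would be needed.

```python
from html.parser import HTMLParser

# Hypothetical review markup; real pages differ and are typically
# rendered client-side, so this is only a structural illustration.
SAMPLE = """
<div class="review"><span class="rating">5</span>
<p class="text">Beautiful park, very clean and safe.</p></div>
<div class="review"><span class="rating">2</span>
<p class="text">Too crowded and poorly maintained.</p></div>
"""

class ReviewParser(HTMLParser):
    """Collect {rating, text} records from the sample markup."""
    def __init__(self):
        super().__init__()
        self.reviews = []
        self._field = None      # which field the parser is inside
        self._current = {}

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "")
        if "rating" in classes:
            self._field = "rating"
        elif "text" in classes:
            self._field = "text"

    def handle_data(self, data):
        if self._field and data.strip():
            self._current[self._field] = data.strip()
            self._field = None
            if len(self._current) == 2:   # both rating and text collected
                self.reviews.append(self._current)
                self._current = {}

parser = ReviewParser()
parser.feed(SAMPLE)
print(parser.reviews)
```

Each park's review pages would be fed through a parser like this, then appended to a single dataset keyed by park name and platform.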

Descriptive Statistics

In total, this study collected 11,604 TripAdvisor comments and 28,053 Google Maps comments.

After verifying the sentiment of each comment against its rating, we found that most entries are positive.
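The verification step amounts to checking whether a comment's text sentiment agrees with its star rating. Below is a deliberately simple lexicon-based stand-in for the fine-tuned model used in the study; the word lists and the rating cutoffs are illustrative assumptions, not the project's actual lexicon.

```python
# Toy sentiment lexicon; the study used a fine-tuned model instead.
POSITIVE = {"clean", "beautiful", "safe", "welcoming", "great", "lovely"}
NEGATIVE = {"dirty", "unsafe", "crowded", "noisy", "poorly", "bad"}

def sentiment(text: str) -> str:
    """Classify a comment as positive/negative/neutral by word counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def agrees(text: str, rating: int) -> bool:
    """Check whether the text sentiment matches the 1-5 star rating
    (assumed cutoffs: >=4 stars positive, <=2 stars negative)."""
    label = sentiment(text)
    return (label == "positive" and rating >= 4) or \
           (label == "negative" and rating <= 2) or \
           label == "neutral"

print(agrees("Beautiful clean park, very safe", 5))
print(agrees("Dirty and unsafe at night", 1))
```

Comments where the text sentiment and the rating disagree can then be flagged for manual inspection before computing the distributions below.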

[Figures: distribution of sentiment]

Summary

Our results confirm prior empirical findings that most park users expect a good park to be clean, well-maintained, safe, and welcoming. Preferences for park size, types of amenities, and traffic volume, however, vary from person to person. Other factors, such as temperature and weather conditions, are associated with geographic location and season.

However, as an exploratory first attempt, this study has limitations and biases in both the data collection and the analysis process.

First, the method is not readily generalizable and remains labour-intensive in both data collection and analysis. Even with the assistance of automated web-scraping scripts, the list of parks to be collected still had to be compiled manually. The model also required substantial fine-tuning to achieve a relatively high accuracy in sentiment analysis and feature extraction.

In terms of biases, voluntary response bias exists in the dataset because the reviewers are self-selected, which can lead to polarization. Second, not all park users have access to these online platforms, so the sample may not represent the full population of visitors.