COMPUTER VISION

Tutorials

Book

  • Chollet, F. (2017). Deep learning with Python. Manning.
  • Rosebrock, A. (2017). Deep learning for computer vision with Python. PyImageSearch.
  • Williams, N. W., Casas, A., & Wilkerson, J. D. (2020). Images as data for social science research: An introduction to convolutional neural nets for image classification. Cambridge University Press.
  • Solem, J. E. (2012). Programming computer vision with Python: Tools and algorithms for analyzing images. O'Reilly Media.

Tools

Libraries and packages

  • Keras: a library for building and training neural networks.
  • Pillow (a maintained fork of the Python Imaging Library): a library for opening, manipulating, and saving many image file formats.
  • scikit-image: a collection of image-processing algorithms, including image segmentation, edge detection, feature detection, and geometric transformations.
  • OpenCV: a library for computer vision, image processing, and machine learning.
  • OpenFace: a facial recognition and analysis library. It detects facial action units and is particularly useful for emotion analysis.
  • OpenPose: a library for real-time multi-person detection of body, face, and hand keypoints.
  • Athec: a library for computational aesthetic analysis of visual media in social science research.
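
As a taste of what these libraries automate, the sketch below computes two low-level aesthetic features of the kind Athec operationalizes: mean brightness and the Hasler–Süsstrunk (2003) colorfulness metric. It works directly on a NumPy array standing in for an image you would normally load with Pillow or OpenCV; the function names are my own, not Athec's API.

```python
import numpy as np

def mean_brightness(img):
    """Average luminance of an RGB image array (H x W x 3, values 0-255)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    # Standard luma weights for perceived brightness
    return float(np.mean(0.299 * r + 0.587 * g + 0.114 * b))

def colorfulness(img):
    """Hasler & Suesstrunk (2003) colorfulness metric."""
    r, g, b = (img[..., i].astype(float) for i in range(3))
    rg = r - g                      # red-green opponent channel
    yb = 0.5 * (r + g) - b          # yellow-blue opponent channel
    std = np.sqrt(np.std(rg) ** 2 + np.std(yb) ** 2)
    mean = np.sqrt(np.mean(rg) ** 2 + np.mean(yb) ** 2)
    return float(std + 0.3 * mean)

# A toy 2x2 "image": two pure-red and two pure-blue pixels
img = np.array([[[255, 0, 0], [255, 0, 0]],
                [[0, 0, 255], [0, 0, 255]]], dtype=np.uint8)
print(mean_brightness(img))  # between 0 (black) and 255 (white)
print(colorfulness(img))     # higher = more colorful
```

In practice you would compute such features over thousands of images and feed them into a statistical model, which is exactly the workflow Athec packages up.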

Annotation tool

  • VGG Image Annotator: An offline image annotation tool. It creates an easy-to-use interface for coders to annotate images or videos on their computers.
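
Once coders finish annotating, VIA exports the annotations as JSON. The sketch below flattens such an export into simple rows for analysis; the inline example follows my understanding of the VIA 2.x export format (field names like `shape_attributes` and `region_attributes`), so verify the keys against your own export file.

```python
import json

# A small inline example in the shape of a VIA 2.x JSON export
# (field names assumed from VIA's format; check your own export).
via_export = json.loads("""
{
  "photo1.jpg123456": {
    "filename": "photo1.jpg",
    "size": 123456,
    "regions": [
      {"shape_attributes": {"name": "rect", "x": 10, "y": 20,
                            "width": 50, "height": 40},
       "region_attributes": {"label": "face"}},
      {"shape_attributes": {"name": "rect", "x": 80, "y": 15,
                            "width": 30, "height": 30},
       "region_attributes": {"label": "logo"}}
    ],
    "file_attributes": {}
  }
}
""")

def via_to_rows(export):
    """Flatten a VIA export into (filename, label, x, y, w, h) rows."""
    rows = []
    for entry in export.values():
        for region in entry.get("regions", []):
            shape = region["shape_attributes"]
            if shape.get("name") != "rect":
                continue  # this sketch handles only rectangular boxes
            rows.append((entry["filename"],
                         region["region_attributes"].get("label", ""),
                         shape["x"], shape["y"],
                         shape["width"], shape["height"]))
    return rows

for row in via_to_rows(via_export):
    print(row)
```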

Computer vision APIs

  • Microsoft Azure: object recognition, facial detection/recognition/analysis, customized image analysis, image captioning
  • Face++: facial detection/recognition/analysis, body detection, gesture analysis
  • Clarifai: object recognition, customized image analysis
  • CloudSight: object recognition, image captioning
  • Google Vision: object recognition, text recognition (OCR), facial detection
  • Amazon Rekognition: facial detection/recognition/analysis, object recognition, video analysis
  • IBM Watson: object recognition
  • Sighthound: facial detection/recognition/analysis, vehicle analysis
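
These APIs are typically called over HTTPS with a JSON payload. As one concrete illustration, the sketch below builds (but does not send) a request body for Google Vision's `images:annotate` endpoint requesting label detection. The payload shape follows Google's documented format as I understand it, and the image bytes are fake placeholders; check the current API reference and supply a real API key before relying on this.

```python
import base64
import json

def build_label_request(image_bytes, max_results=10):
    """Build the JSON body for a Google Vision images:annotate call
    requesting label (object) detection. Sending it requires credentials
    and a POST to https://vision.googleapis.com/v1/images:annotate."""
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": "LABEL_DETECTION",
                          "maxResults": max_results}],
        }]
    }

# Fake bytes for illustration only; in practice, read an image file.
body = build_label_request(b"\x89PNG...fake bytes for illustration")
print(json.dumps(body, indent=2)[:120])
```

Most of the other services listed above follow the same pattern (base64-encoded image plus a list of requested analyses), though each has its own field names and authentication scheme.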

Academic papers

The following is a collection of academic papers that apply computer vision to questions in the social sciences, organized by topic to facilitate exploration. The list is not exhaustive; its aim is to spark interest and inspire future research in this fast-moving field. Feel free to reach out to me if there is a particular topic you would like to see covered.

Review

  • Peng, Y., & Lu, Y. (2023). Computational visual analysis in political communication. In Research Handbook on Visual Politics (pp. 42-54). Cheltenham: Edward Elgar.
  • Bucy, E. P. (2023). Politics through machine eyes: What computer vision allows us to see. Journal of Visual Political Communication, 10(1), 59-68.
  • Joo, J., & Steinert-Threlkeld, Z. C. (2022). Image as data: Automated content analysis for visual presentations of political actors and events. Computational Communication Research, 4(1).

Computer vision methodology

  • Goldstein, Y., Legewie, N. M., & Shiffer-Sebba, D. (2023). 3D Social Research: Analysis of Social Interaction Using Computer Vision. Sociological Methods & Research.
  • Peng, Y. (2022). Athec: A Python library for computational aesthetic analysis of visual media in social science research. Computational Communication Research, 4(1).
  • Torres, M., & Cantú, F. (2022). Learning to see: Convolutional neural networks for the analysis of social science data. Political Analysis, 30(1), 113-131.
  • Zhang, H., & Peng, Y. (2021). Image clustering: An unsupervised approach to categorize visual data in social science research. Sociological Methods & Research.
  • Araujo, T., Lock, I., & van de Velde, B. (2020). Automated visual content analysis (AVCA) in communication research: A protocol for large scale image classification with pre-trained computer vision models. Communication Methods and Measures.

Bias, ethics, and fairness

  • Sun, L., Wei, M., Sun, Y., Suh, Y. J., Shen, L., & Yang, S. (2023). Smiling Women Pitching Down: Auditing Representational and Presentational Gender Biases in Image Generative AI. arXiv preprint arXiv:2305.10566.
  • Schwemmer, C., Knight, C., Bello-Pardo, E. D., Oklobdzija, S., Schoonvelde, M., & Lockhart, J. W. (2020). Diagnosing gender bias in image recognition systems. Socius, 6.
  • De Vries, T., Misra, I., Wang, C., & Van der Maaten, L. (2019). Does object recognition work for everyone?. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 52-59).
  • Wang, Y., & Kosinski, M. (2018). Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. Journal of Personality and Social Psychology, 114(2), 246–257.
  • Keyes, O. (2018). The misgendering machines: Trans/HCI implications of automatic gender recognition. Proceedings of the ACM on Human-Computer Interaction, 2(CSCW), 1-22.
  • Buolamwini, J., & Gebru, T. (2018, January). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on fairness, accountability and transparency (pp. 77-91). PMLR.
  • Hamidi, F., Scheuerman, M. K., & Branham, S. M. (2018, April). Gender recognition or gender reductionism? The social implications of embedded gender recognition systems. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1-13).
  • Phillips, P. J., Jiang, F., Narvekar, A., Ayyad, J., & O’Toole, A. J. (2011). An other-race effect for face recognition algorithms. ACM Transactions on Applied Perception (TAP), 8(2), 1-11.

Visuals of politicians

  • Shah, D. V., Sun, Z., Bucy, E. P., Kim, S. J., Sun, Y., Li, M., & Sethares, W. (2023). Building an ICCN Multimodal Classifier of Aggressive Political Debate Style: Towards a Computational Understanding of Candidate Performance Over Time. Communication Methods and Measures, 1-18.
  • Bossetta, M., & Schmøkel, R. (2023). Cross-platform emotions and audience engagement in social media political campaigning: Comparing candidates’ Facebook and Instagram images in the 2020 US election. Political Communication, 40(1), 48-68.
  • Dietrich, B. J. (2021). Using motion detection to measure social polarization in the US House of Representatives. Political Analysis, 29(2), 250-259.
  • Boussalis, C., Coan, T. G., Holman, M. R., & Müller, S. (2021). Gender, candidate emotional expression, and voter reactions during televised debates. American Political Science Review, 115(4), 1242-1257.
  • Peng, Y. (2020). What makes politicians’ Instagram posts popular? Analyzing social media strategies of candidates and office holders with computer vision. The International Journal of Press/Politics.
  • Xi, N., Ma, D., Liou, M., Steinert-Threlkeld, Z. C., Anastasopoulos, J., & Joo, J. (2020, May). Understanding the Political Ideology of Legislators from Social Media Images. In Proceedings of the International AAAI Conference on Web and Social Media (pp. 726-737).
  • Haim, M., & Jungblut, M. (2020). Politicians’ self-depiction and their news portrayal: Evidence from 28 countries using visual computational analysis. Political Communication.
  • Joo, J., Bucy, E. P., & Seidel, C. (2019). Automated coding of televised leader displays: Detecting nonverbal political behavior with computer vision and deep learning. International Journal of Communication.
  • Fridkin, K. L., Gershon, S. A., Courey, J., & LaPlant, K. (2019). Gender differences in emotional reactions to the first 2016 presidential debate. Political Behavior, 1–31.
  • Peng, Y. (2018). Same candidates, different faces: Uncovering media bias in visual portrayals of presidential candidates with computer vision. Journal of Communication, 68(5), 920–941.
  • Joo, J., Steen, F. F., & Zhu, S. C. (2015). Automated facial trait judgment and election outcome prediction: Social dimensions of face. In Proceedings of the IEEE International Conference on Computer Vision (pp. 3712-3720). IEEE.
  • Joo, J., Li, W., Steen, F. F., & Zhu, S. C. (2014). Visual persuasion: Inferring communicative intents of images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 216-223). IEEE.
  • Horiuchi, Y., Komatsu, T., & Nakaya, F. (2012). Should candidates smile to win elections? An application of automated face recognition technology. Political Psychology, 33(6), 925–933.

Visual misinformation, disinformation, and AI-generated media

  • Peng, Y., Lu, Y., & Shen, C. (2023). An Agenda for Studying Credibility Perceptions of Visual Misinformation. Political Communication, 40(2), 225-237.
  • Bastos, M., Mercea, D., & Goveia, F. (2023). Guy next door and implausibly attractive young women: The visual frames of social media propaganda. New Media & Society, 25(8), 2014-2033.
  • Chen, K., Kim, S. J., Gao, Q., & Raschka, S. (2022). Visual framing of science conspiracy videos: Integrating machine learning with communication theories to study the use of color and brightness. Computational Communication Research, 4(1).
  • Yang, Y., Davis, T., & Hindman, M. (2023). Visual misinformation on Facebook. Journal of Communication.
  • Nightingale, S. J., & Farid, H. (2022). AI-synthesized faces are indistinguishable from real faces and more trustworthy. Proceedings of the National Academy of Sciences, 119(8), e2120481119.
  • Groh, M., Epstein, Z., Firestone, C., & Picard, R. (2022). Deepfake detection by human crowds, machines, and machine-informed crowds. Proceedings of the National Academy of Sciences, 119(1), e2110013119.

Social media users and online discourses

  • Kim, S. J., Villanueva, I. I., & Chen, K. (2023). Going beyond affective polarization: how emotions and identities are used in anti-vaccination TikTok videos. Political Communication, 1-20.
  • Muise, D., Lu, Y., Pan, J., & Reeves, B. (2022). Selectively localized: Temporal and visual structure of smartphone screen activity across media environments. Mobile Media & Communication, 10(3), 487-509.
  • Kim, Y., & Kim, J. H. (2018). Using computer vision techniques on Instagram to link users’ personalities and genders to the features of their photos: An exploratory study. Information Processing & Management, 54(6), 1101–1114.
  • Reece, A. G., & Danforth, C. M. (2017). Instagram photos reveal predictive markers of depression. EPJ Data Science, 6(1), 15.
  • Manikonda, L., & De Choudhury, M. (2017, May). Modeling and understanding visual attributes of mental health disclosures in social media. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 170–181). New York, NY: ACM.
  • Liu, L., Preotiuc-Pietro, D., Samani, Z. R., Moghaddam, M. E., & Ungar, L. (2016). Analyzing personality through social media profile picture choice. In Proceedings of the Tenth International AAAI Conference on Web and Social Media. (pp. 211–220). Cologne, Germany: AAAI.
  • Abdullah, S., Murnane, E. L., Costa, J. M., & Choudhury, T. (2015, February). Collective smile: Measuring societal happiness from geolocated images. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (pp. 361–374). New York, NY: ACM.

Virality and popularity

  • Sharma, M., & Peng, Y. (2023). How visual aesthetics and calorie density predict food image popularity on Instagram: A computer vision analysis. Health Communication, 1-15.
  • Li, Y., & Xie, Y. (2020). Is a picture worth a thousand words? An empirical study of image content and social media engagement. Journal of Marketing Research, 57(1), 1-19.
  • Peng, Y. & Jemmott III, J. (2018). Feast for the eyes: Effects of food perceptions and computer vision features on food photo popularity. International Journal of Communication, 12, 313–336.
  • Bakhshi, S., Shamma, D. A., Kennedy, L., Song, Y., de Juan, P., & Kaye, J. J. (2016, May). Fast, cheap, and good: Why animated GIFs engage us. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 575–586). New York, NY: ACM.
  • Gygli, M., & Soleymani, M. (2016, October). Analyzing and predicting GIF interestingness. In Proceedings of the 2016 ACM on Multimedia Conference (pp. 122–126). New York, NY: ACM.
  • Bakhshi, S., & Gilbert, E. (2015). Red, purple and pink: The colors of diffusion on Pinterest. PLOS ONE, 10(2), e0117148.
  • Totti, L. C., Costa, F. A., Avila, S., Valle, E., Meira Jr, W., & Almeida, V. (2014). The impact of visual attributes on online image diffusion. In Proceedings of the ACM Conference on Web Science (pp. 42–51). New York, NY: ACM.
  • Bakhshi, S., Shamma, D. A., & Gilbert, E. (2014). Faces engage us: Photos with faces attract more likes and comments on Instagram. In Proceedings of the 32nd ACM Conference on Human Factors in Computing Systems (pp. 965–974). New York, NY: ACM.

Social movements

  • Kim, M., & Bas, O. (2023). Seeing the Black Lives Matter Movement Through Computer Vision? An Automated Visual Analysis of News Media Images on Facebook. Social Media + Society, 9(3).
  • Steinert-Threlkeld, Z. C., Chan, A. M., & Joo, J. (2022). How state and protester violence affect protest dynamics. The Journal of Politics, 84(2), 798-813.
  • Sobolev, A., Chen, M. K., Joo, J., & Steinert-Threlkeld, Z. C. (2020). News and geolocated social media accurately measure protest size variation. American Political Science Review, 114(4), 1343-1351.
  • Zhang, H., & Pan, J. (2019). CASM: A deep-learning approach for identifying collective action events with text and image data from social media. Sociological Methodology, 49(1), 1-57.

Art, aesthetics, and sentiment

  • Cowen, A. S., Keltner, D., Schroff, F., Jou, B., Adam, H., & Prasad, G. (2021). Sixteen facial expressions occur in similar contexts worldwide. Nature, 589(7841), 251-257.
  • Iigaya, K., Yi, S., Wahle, I. A., Tanwisuth, K., & O’Doherty, J. P. (2021). Aesthetic preference for art can be predicted from a mixture of low- and high-level visual features. Nature Human Behaviour, 5(6), 743-755.
  • Lee, B., Seo, M. K., Kim, D., Shin, I. S., Schich, M., Jeong, H., & Han, S. K. (2020). Dissecting landscape art history with information theory. Proceedings of the National Academy of Sciences.
  • Machajdik, J., & Hanbury, A. (2010). Affective image classification using features inspired by psychology and art theory. In Proceedings of the ACM International Conference on Multimedia (pp. 83–92). New York, NY: ACM.
  • Datta, R., Joshi, D., Li, J., & Wang, J. Z. (2006, May). Studying aesthetics in photographic images using a computational approach. In Proceedings of the European Conference on Computer Vision (pp. 288–301). Berlin, Germany: Springer.
  • Ke, Y., Tang, X., & Jing, F. (2006, June). The design of high-level features for photo quality assessment. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 419–426). New York, NY: IEEE.

Marketing, advertising, and consumer behaviors

  • Zhang, S., Lee, D., Singh, P. V., & Srinivasan, K. (2022). What makes a good image? Airbnb demand analytics leveraging interpretable image features. Management Science, 68(8), 5644-5666.
  • Zhou, M., Chen, G. H., Ferreira, P., & Smith, M. D. (2021). Consumer behavior in the online classroom: Using video analytics and machine learning to understand the consumption of video courseware. Journal of Marketing Research, 58(6), 1079-1100.
  • Matz, S. C., Segalin, C., Stillwell, D., Müller, S. R., & Bos, M. W. (2019). Predicting the personal appeal of marketing images using computational methods. Journal of Consumer Psychology, 29(3), 370-390.

Environment

  • Naik, N., Kominers, S. D., Raskar, R., Glaeser, E. L., & Hidalgo, C. A. (2017). Computer vision uncovers predictors of physical urban change. Proceedings of the National Academy of Sciences, 114(29), 7571–7576.
  • Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E. L., & Fei-Fei, L. (2017). Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences, 114(50), 13108–13113.
  • Jean, N., Burke, M., Xie, M., Davis, W. M., Lobell, D. B., & Ermon, S. (2016). Combining satellite imagery and machine learning to predict poverty. Science, 353(6301), 790–794.