
Content Analysis | Guide, Methods & Examples

Published on July 18, 2019 by Amy Luo. Revised on June 22, 2023.

Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, which can be written, oral, or visual:

  • Books, newspapers and magazines
  • Speeches and interviews
  • Web content and social media posts
  • Photographs and films

Content analysis can be both quantitative (focused on counting and measuring) and qualitative (focused on interpreting and understanding). In both types, you categorize or “code” words, themes, and concepts within the texts and then analyze the results.

Table of contents

  • What is content analysis used for?
  • Advantages of content analysis
  • Disadvantages of content analysis
  • How to conduct content analysis
  • Other interesting articles

What is content analysis used for?

Researchers use content analysis to find out about the purposes, messages, and effects of communication content. They can also make inferences about the producers and audience of the texts they analyze.

Content analysis can be used to quantify the occurrence of certain words, phrases, subjects or concepts in a set of historical or contemporary texts.

Quantitative content analysis example

To research the importance of employment issues in political campaigns, you could analyze campaign speeches for the frequency of terms such as unemployment, jobs, and work, and use statistical analysis to find differences over time or between candidates.
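The counting step in this example can be sketched in Python. The speeches and the term list below are invented for illustration:

```python
# Count how often employment-related terms appear in each speech.
# The speeches and the term list are invented for illustration.
import re
from collections import Counter

TERMS = {"unemployment", "jobs", "work"}

def term_frequencies(speech):
    """Count occurrences of the target terms in one speech."""
    tokens = re.findall(r"[a-z']+", speech.lower())
    return Counter(t for t in tokens if t in TERMS)

speeches = {
    "candidate_a": "Jobs, jobs, jobs! Unemployment is the issue of our time.",
    "candidate_b": "We will work together, because work builds communities.",
}

for candidate, text in speeches.items():
    print(candidate, dict(term_frequencies(text)))
```

The per-candidate counts could then feed a statistical comparison over time or between candidates.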

In addition, content analysis can be used to make qualitative inferences by analyzing the meaning and semantic relationship of words and concepts.

Qualitative content analysis example

To gain a more qualitative understanding of employment issues in political campaigns, you could locate the word unemployment in speeches, identify what other words or phrases appear next to it (such as economy, inequality, or laziness), and analyze the meanings of these relationships to better understand the intentions and targets of different campaigns.
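As a sketch of the first step here, collecting the words that appear near “unemployment”, one could use a simple token window (the speech text is invented):

```python
# Collect words appearing within a few tokens of a keyword, as raw
# material for interpreting how a campaign frames the issue.
import re
from collections import Counter

def collocates(text, keyword, window=3):
    """Count words within `window` tokens of each keyword occurrence."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(t for t in tokens[lo:hi] if t != keyword)
    return counts

speech = ("Unemployment reflects a failing economy. "
          "Rising inequality and unemployment go hand in hand.")
print(collocates(speech, "unemployment").most_common(3))
```

Interpreting what those neighboring words mean is, of course, the qualitative part that no count can replace.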

Because content analysis can be applied to a broad range of texts, it is used in a variety of fields, including marketing, media studies, anthropology, cognitive science, psychology, and many social science disciplines. It has various possible goals:

  • Finding correlations and patterns in how concepts are communicated
  • Understanding the intentions of an individual, group or institution
  • Identifying propaganda and bias in communication
  • Revealing differences in communication in different contexts
  • Analyzing the consequences of communication content, such as the flow of information or audience responses

Advantages of content analysis

  • Unobtrusive data collection

You can analyze communication and social interaction without the direct involvement of participants, so your presence as a researcher doesn’t influence the results.

  • Transparent and replicable

When done well, content analysis follows a systematic procedure that can easily be replicated by other researchers, yielding results with high reliability.

  • Highly flexible

You can conduct content analysis at any time, in any location, and at low cost – all you need is access to the appropriate sources.

Disadvantages of content analysis

  • Reductive

Focusing on words or phrases in isolation can sometimes be overly reductive, disregarding context, nuance, and ambiguous meanings.

  • Subjective

Content analysis almost always involves some level of subjective interpretation, which can affect the reliability and validity of the results and conclusions, leading to various types of research bias and cognitive bias.

  • Time intensive

Manually coding large volumes of text is extremely time-consuming, and it can be difficult to automate effectively.

How to conduct content analysis

If you want to use content analysis in your research, you need to start with a clear, direct research question.

Example research question for content analysis

Is there a difference in how the US media represents younger politicians compared to older ones in terms of trustworthiness?

Next, you follow these five steps.

1. Select the content you will analyze

Based on your research question, choose the texts that you will analyze. You need to decide:

  • The medium (e.g. newspapers, speeches or websites) and genre (e.g. opinion pieces, political campaign speeches, or marketing copy)
  • The inclusion and exclusion criteria (e.g. newspaper articles that mention a particular event, speeches by a certain politician, or websites selling a specific type of product)
  • The parameters in terms of date range, location, etc.

If there are only a small number of texts that meet your criteria, you might analyze all of them. If there is a large volume of texts, you can select a sample.

2. Define the units and categories of analysis

Next, you need to determine the level at which you will analyze your chosen texts. This means defining:

  • The unit(s) of meaning that will be coded. For example, are you going to record the frequency of individual words and phrases, the characteristics of people who produced or appear in the texts, the presence and positioning of images, or the treatment of themes and concepts?
  • The set of categories that you will use for coding. Categories can be objective characteristics (e.g. aged 30-40, lawyer, parent) or more conceptual (e.g. trustworthy, corrupt, conservative, family oriented).

Your units of analysis are the politicians who appear in each article and the words and phrases used to describe them. Based on your research question, you categorize based on age and the concept of trustworthiness. To get more detailed data, you also code for other categories, such as each mentioned politician’s party and marital status.

3. Develop a set of rules for coding

Coding involves organizing the units of meaning into the previously defined categories. Especially with more conceptual categories, it’s important to clearly define the rules for what will and won’t be included to ensure that all texts are coded consistently.

Coding rules are especially important if multiple researchers are involved, but even if you’re coding all of the text by yourself, recording the rules makes your method more transparent and reliable.

In considering the category “younger politician,” you decide which titles will be coded with this category (senator, governor, counselor, mayor). With “trustworthy,” you decide which specific words or phrases related to trustworthiness (e.g. honest and reliable) will be coded in this category.

4. Code the text according to the rules

You go through each text and record all relevant data in the appropriate categories. This can be done manually or aided with computer programs, such as QSR NVivo, Atlas.ti, and Diction, which can help speed up the process of counting and categorizing words and phrases.

Following your coding rules, you examine each newspaper article in your sample. You record the characteristics of each politician mentioned, along with all words and phrases related to trustworthiness that are used to describe them.
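A minimal sketch of this coding step, assuming the politician mentions have already been extracted and age-classified; the names, sentences, and trust-term list are invented:

```python
# Tally trustworthiness terms per age group across coded mentions.
# All names, sentences, and the term list are invented examples.
import re

TRUST_TERMS = {"honest", "reliable", "trustworthy"}

mentions = [  # (age_group, sentence) pairs drawn from the sample
    ("older",   "Senator Grey is an honest and reliable voice."),
    ("younger", "Mayor Lane is ambitious but untested."),
    ("older",   "Governor Park remains trustworthy on the economy."),
]

tally = {"older": 0, "younger": 0}
for age_group, sentence in mentions:
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    tally[age_group] += len(words & TRUST_TERMS)

print(tally)  # {'older': 3, 'younger': 0}
```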

5. Analyze the results and draw conclusions

Once coding is complete, the collected data is examined to find patterns and draw conclusions in response to your research question. You might use statistical analysis to find correlations or trends, discuss your interpretations of what the results mean, and make inferences about the creators, context and audience of the texts.

Let’s say the results reveal that words and phrases related to trustworthiness appeared in the same sentence as an older politician more frequently than they did in the same sentence as a younger politician. From these results, you conclude that national newspapers present older politicians as more trustworthy than younger politicians, and infer that this might have an effect on readers’ perceptions of younger people in politics.
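The final comparison could be as simple as a rate per mention; the counts below are invented:

```python
# Compare the rate of trustworthiness terms per politician mention
# for each age group. All counts are invented for illustration.
mentions = {"older": 40, "younger": 35}     # coded politician mentions
trust_hits = {"older": 22, "younger": 9}    # mentions with trust terms

for group in mentions:
    rate = trust_hits[group] / mentions[group]
    print(f"{group}: {rate:.0%} of mentions include trustworthiness terms")
```

A formal test on the underlying counts (e.g. a chi-square test) would show whether such a difference is statistically meaningful.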


Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Normal distribution
  • Measures of central tendency
  • Chi square tests
  • Confidence interval
  • Quartiles & Quantiles
  • Cluster sampling
  • Stratified sampling
  • Thematic analysis
  • Cohort study
  • Peer review
  • Ethnography

Research bias

  • Implicit bias
  • Cognitive bias
  • Conformity bias
  • Hawthorne effect
  • Availability heuristic
  • Attrition bias
  • Social desirability bias

Cite this Scribbr article

Luo, A. (2023, June 22). Content Analysis | Guide, Methods & Examples. Scribbr. Retrieved October 8, 2024, from https://www.scribbr.com/methodology/content-analysis/


Content Analysis

Content analysis is a research tool used to determine the presence of certain words, themes, or concepts within some given qualitative data (i.e., text). Using content analysis, researchers can quantify and analyze the presence, meanings, and relationships of such words, themes, or concepts. As an example, researchers can evaluate language used within a news article to search for bias or partiality. Researchers can then make inferences about the messages within the texts, the writer(s), the audience, and even the culture and time surrounding the text.

Description

Sources of data could be from interviews, open-ended questions, field research notes, conversations, or literally any occurrence of communicative language (such as books, essays, discussions, newspaper headlines, speeches, media, historical documents). A single study may analyze various forms of text in its analysis. To analyze the text using content analysis, the text must be coded, or broken down, into manageable categories for analysis (i.e., “codes”). Once the text is coded, the codes can then be grouped into broader “code categories” to summarize the data even further.
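The roll-up from codes to code categories can be pictured as a simple mapping; the codes and categories below are invented:

```python
# Roll individually coded segments up into broader code categories.
# The codes and the category mapping are invented for illustration.
from collections import Counter

coded_segments = ["fear", "anger", "hope", "fear", "pride", "hope", "hope"]

CODE_CATEGORIES = {            # code -> broader code category
    "fear": "negative emotion",
    "anger": "negative emotion",
    "hope": "positive emotion",
    "pride": "positive emotion",
}

category_counts = Counter(CODE_CATEGORIES[c] for c in coded_segments)
print(category_counts)
```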

Three different definitions of content analysis are provided below.

Definition 1: “Any technique for making inferences by systematically and objectively identifying special characteristics of messages.” (from Holsti, 1968)

Definition 2: “An interpretive and naturalistic approach. It is both observational and narrative in nature and relies less on the experimental elements normally associated with scientific research (reliability, validity, and generalizability).” (from Ethnography, Observational Research, and Narrative Inquiry, 1994-2012)

Definition 3: “A research technique for the objective, systematic and quantitative description of the manifest content of communication.” (from Berelson, 1952)

Uses of Content Analysis

  • Identify the intentions, focus, or communication trends of an individual, group, or institution
  • Describe attitudinal and behavioral responses to communications
  • Determine the psychological or emotional state of persons or groups
  • Reveal international differences in communication content
  • Reveal patterns in communication content
  • Pre-test and improve an intervention or survey prior to launch
  • Analyze focus group interviews and open-ended questions to complement quantitative data

Types of Content Analysis

There are two general types of content analysis: conceptual analysis and relational analysis. Conceptual analysis determines the existence and frequency of concepts in a text. Relational analysis develops the conceptual analysis further by examining the relationships among concepts in a text. Each type of analysis may lead to different results, conclusions, interpretations and meanings.

Conceptual Analysis

Typically people think of conceptual analysis when they think of content analysis. In conceptual analysis, a concept is chosen for examination and the analysis involves quantifying and counting its presence. The main goal is to examine the occurrence of selected terms in the data. Terms may be explicit or implicit. Explicit terms are easy to identify. Coding implicit terms is more complicated: you need to decide the level of implication and base judgments on subjective interpretation (an issue for reliability and validity). For this reason, coding implicit terms usually involves using a dictionary, contextual translation rules, or both.

To begin a conceptual content analysis, first identify the research question and choose a sample or samples for analysis. Next, the text must be coded into manageable content categories. This is basically a process of selective reduction. By reducing the text to categories, the researcher can focus on and code for specific words or patterns that inform the research question.

General steps for conducting a conceptual content analysis:

1. Decide the level of analysis: word, word sense, phrase, sentence, themes

2. Decide how many concepts to code for: develop a pre-defined or interactive set of categories or concepts. Decide either: A. to allow flexibility to add categories through the coding process, or B. to stick with the pre-defined set of categories.

Option A allows for the introduction and analysis of new and important material that could have significant implications to one’s research question.

Option B allows the researcher to stay focused and examine the data for specific concepts.

3. Decide whether to code for existence or frequency of a concept. The decision changes the coding process.

When coding for the existence of a concept, the researcher would count a concept only once if it appeared anywhere in the data, no matter how many times it appeared.

When coding for the frequency of a concept, the researcher would count the number of times a concept appears in a text.
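The difference between the two coding decisions can be sketched on the same (invented) documents:

```python
# Code the same documents for existence vs. frequency of "risk".
# The documents are invented for illustration.
import re

docs = [
    "Risk, risk, and more risk.",
    "A calm and steady outlook.",
    "One risk remains.",
]

def concept_hits(doc, concept="risk"):
    return len(re.findall(rf"\b{concept}\b", doc.lower()))

existence = sum(1 for d in docs if concept_hits(d) > 0)  # each doc counts once
frequency = sum(concept_hits(d) for d in docs)           # every mention counts

print(existence, frequency)  # 2 4
```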

4. Decide on how you will distinguish among concepts:

Should words be coded exactly as they appear, or coded as the same when they appear in different forms? For example, “dangerous” vs. “dangerousness.” The point here is to create coding rules so that these word segments are transparently categorized in a logical fashion. The rules could make all of these word segments fall into the same category, or they could be formulated so that the researcher can distinguish them into separate codes.

What level of implication is to be allowed? Words that imply the concept, or only words that explicitly state it? For example, “dangerous” vs. “the person is scary” vs. “that person could cause harm to me.” These word segments may not merit separate categories, due to the implicit meaning of “dangerous.”
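One way to implement such distinguishing rules is an explicit form-to-code table; the rules below are invented, and a real project would document them in its codebook (or use a stemmer):

```python
# Collapse different word forms into a single code via explicit rules.
# The rule table is a hypothetical example of a coding rule.
from typing import Optional

FORM_RULES = {
    "dangerous": "DANGER",
    "dangerousness": "DANGER",
    "danger": "DANGER",
    "safe": "SAFETY",
    "safety": "SAFETY",
}

def code_token(token: str) -> Optional[str]:
    """Return the code for a token, or None if it is uncoded."""
    return FORM_RULES.get(token.lower())

tokens = ["Dangerousness", "is", "dangerous", "but", "safety", "matters"]
codes = [c for t in tokens if (c := code_token(t))]
print(codes)  # ['DANGER', 'DANGER', 'SAFETY']
```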

5. Develop rules for coding your texts. After the decisions of steps 1-4 are complete, a researcher can begin developing rules for translating text into codes. This will keep the coding process organized and consistent. The researcher can code for exactly what he/she wants to code. Validity of the coding process is ensured when the researcher is consistent and coherent in their codes, meaning that they follow their translation rules. In content analysis, abiding by the translation rules is equivalent to validity.

6. Decide what to do with irrelevant information: should this be ignored (e.g. common English words like “the” and “and”), or used to reexamine the coding scheme in the case that it would add to the outcome of coding?

7. Code the text: This can be done by hand or by using software. By using software, researchers can input categories and have coding done automatically, quickly and efficiently, by the software program. When coding is done by hand, a researcher can recognize errors far more easily (e.g. typos, misspelling). If using computer coding, text could be cleaned of errors to include all available data. This decision of hand vs. computer coding is most relevant for implicit information where category preparation is essential for accurate coding.

8. Analyze your results: Draw conclusions and generalizations where possible. Determine what to do with irrelevant, unwanted, or unused text: reexamine, ignore, or reassess the coding scheme. Interpret results carefully as conceptual content analysis can only quantify the information. Typically, general trends and patterns can be identified.

Relational Analysis

Relational analysis begins like conceptual analysis, where a concept is chosen for examination. However, the analysis involves exploring the relationships between concepts. Individual concepts are viewed as having no inherent meaning and rather the meaning is a product of the relationships among concepts.

To begin a relational content analysis, first identify a research question and choose a sample or samples for analysis. The research question must be focused so the concept types are not open to interpretation and can be summarized. Next, select the text for analysis carefully: balance having enough information for a thorough analysis (so that results are not limited) against having so much information that the coding process becomes too arduous to supply meaningful and worthwhile results.

There are three subcategories of relational analysis to choose from prior to going on to the general steps.

Affect extraction: an emotional evaluation of concepts explicit in a text. A challenge to this method is that emotions can vary across time, populations, and space. However, it could be effective at capturing the emotional and psychological state of the speaker or writer of the text.

Proximity analysis: an evaluation of the co-occurrence of explicit concepts in the text. The text is scanned in strings of words, called “windows,” for the co-occurrence of concepts. The result is the creation of a “concept matrix”: a group of interrelated, co-occurring concepts that together suggest an overall meaning.

Cognitive mapping: a visualization technique for either affect extraction or proximity analysis. Cognitive mapping attempts to create a model of the overall meaning of the text such as a graphic map that represents the relationships between concepts.
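The windowing idea behind proximity analysis can be sketched as follows (the concept list and text are invented; note that overlapping windows count a co-occurring pair once per window):

```python
# Scan a sliding window of tokens and count which target concepts
# co-occur inside it, building a small co-occurrence table.
import re
from collections import Counter
from itertools import combinations

CONCEPTS = {"economy", "unemployment", "inequality"}

def cooccurrences(text, window=5):
    tokens = re.findall(r"[a-z]+", text.lower())
    pairs = Counter()
    for start in range(len(tokens) - window + 1):
        # Concepts present in this window, in a canonical order.
        seen = sorted(set(tokens[start:start + window]) & CONCEPTS)
        pairs.update(combinations(seen, 2))
    return pairs

text = "the economy suffers when unemployment rises and inequality deepens"
print(cooccurrences(text))
```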

General steps for conducting a relational content analysis:

1. Determine the type of analysis: Once the sample has been selected, the researcher needs to determine what types of relationships to examine and the level of analysis: word, word sense, phrase, sentence, themes.

2. Reduce the text to categories and code for words or patterns. A researcher can code for existence of meanings or words.

3. Explore the relationships between concepts: once the words are coded, the text can be analyzed for the following:

Strength of relationship: degree to which two or more concepts are related.

Sign of relationship: are concepts positively or negatively related to each other?

Direction of relationship: the types of relationship that categories exhibit. For example, “X implies Y,” “X occurs before Y,” “if X then Y,” or “X is the primary motivator of Y.”

4. Code the relationships: a difference between conceptual and relational analysis is that the statements or relationships between concepts are coded.

5. Perform statistical analyses: explore differences or look for relationships among the variables identified during coding.

6. Map out the representations: such as decision mapping and mental models.

Reliability and Validity

Reliability: Because coding is done by human researchers, coding errors can never be eliminated, only minimized. Generally, 80% agreement is considered an acceptable level of reliability. Three criteria comprise the reliability of a content analysis:

Stability: the tendency for coders to consistently re-code the same data in the same way over a period of time.

Reproducibility: the tendency for a group of coders to classify category membership in the same way.

Accuracy: extent to which the classification of text corresponds to a standard or norm statistically.
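The simplest reproducibility check, percent agreement between two coders, can be computed directly (the codes below are invented; chance-corrected measures such as Cohen’s kappa are more robust):

```python
# Percent agreement between two coders on the same ten segments,
# checked against the 80% threshold mentioned above. Codes invented.
coder_a = ["pos", "pos", "neg", "neu", "pos", "neg", "neg", "pos", "neu", "pos"]
coder_b = ["pos", "neg", "neg", "neu", "pos", "neg", "pos", "pos", "neu", "pos"]

agreements = sum(a == b for a, b in zip(coder_a, coder_b))
agreement_rate = agreements / len(coder_a)

print(f"{agreement_rate:.0%}")  # 80%
print("acceptable" if agreement_rate >= 0.80 else "recode")
```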

Validity : Three criteria comprise the validity of a content analysis:

Closeness of categories: this can be achieved by utilizing multiple classifiers to arrive at an agreed upon definition of each specific category. Using multiple classifiers, a concept category that may be an explicit variable can be broadened to include synonyms or implicit variables.

Conclusions: What level of implication is allowable? Do conclusions correctly follow the data? Are results explainable by other phenomena? This becomes especially problematic when using computer software for analysis and distinguishing between synonyms. For example, the word “mine,” variously denotes a personal pronoun, an explosive device, and a deep hole in the ground from which ore is extracted. Software can obtain an accurate count of that word’s occurrence and frequency, but not be able to produce an accurate accounting of the meaning inherent in each particular usage. This problem could throw off one’s results and make any conclusion invalid.

Generalizability of the results to a theory: dependent on the clear definitions of concept categories, how they are determined and how reliable they are at measuring the idea one is seeking to measure. Generalizability parallels reliability as much of it depends on the three criteria for reliability.

Advantages of Content Analysis

  • Directly examines communication using text
  • Allows for both qualitative and quantitative analysis
  • Provides valuable historical and cultural insights over time
  • Allows a closeness to data
  • Coded form of the text can be statistically analyzed
  • Unobtrusive means of analyzing interactions
  • Provides insight into complex models of human thought and language use
  • When done well, is considered a relatively “exact” research method
  • Is a readily understood and inexpensive research method
  • Becomes a more powerful tool when combined with other research methods such as interviews, observation, and use of archival records; it is very useful for analyzing historical material, especially for documenting trends over time

Disadvantages of Content Analysis

  • Can be extremely time consuming
  • Is subject to increased error, particularly when relational analysis is used to attain a higher level of interpretation
  • Is often devoid of a theoretical base, or attempts too liberally to draw meaningful inferences about the relationships and impacts implied in a study
  • Is inherently reductive, particularly when dealing with complex texts
  • Tends too often to consist simply of word counts
  • Often disregards the context that produced the text, as well as the state of things after the text is produced
  • Can be difficult to automate or computerize

Textbooks & Chapters  

Berelson, Bernard. Content Analysis in Communication Research. New York: Free Press, 1952.

Busha, Charles H., and Stephen P. Harter. Research Methods in Librarianship: Techniques and Interpretation. New York: Academic Press, 1980.

de Sola Pool, Ithiel. Trends in Content Analysis. Urbana: University of Illinois Press, 1959.

Krippendorff, Klaus. Content Analysis: An Introduction to its Methodology. Beverly Hills: Sage Publications, 1980.

Fielding, NG & Lee, RM. Using Computers in Qualitative Research. SAGE Publications, 1991. (Refer to Chapter by Seidel, J. ‘Method and Madness in the Application of Computer Technology to Qualitative Data Analysis’.)

Methodological Articles  

Hsieh HF & Shannon SE. (2005). Three Approaches to Qualitative Content Analysis. Qualitative Health Research. 15(9): 1277-1288.

Elo S, Kaarianinen M, Kanste O, Polkki R, Utriainen K, & Kyngas H. (2014). Qualitative Content Analysis: A focus on trustworthiness. Sage Open. 4:1-10.

Application Articles  

Abroms LC, Padmanabhan N, Thaweethai L, & Phillips T. (2011). iPhone Apps for Smoking Cessation: A content analysis. American Journal of Preventive Medicine. 40(3):279-285.

Ullstrom S. Sachs MA, Hansson J, Ovretveit J, & Brommels M. (2014). Suffering in Silence: a qualitative study of second victims of adverse events. British Medical Journal, Quality & Safety Issue. 23:325-331.

Owen P. (2012). Portrayals of Schizophrenia by Entertainment Media: A Content Analysis of Contemporary Movies. Psychiatric Services. 63:655-659.

Choosing whether to conduct a content analysis by hand or by using computer software can be difficult. Refer to ‘Method and Madness in the Application of Computer Technology to Qualitative Data Analysis’ listed above in “Textbooks and Chapters” for a discussion of the issue.

QSR NVivo:  http://www.qsrinternational.com/products.aspx

Atlas.ti:  http://www.atlasti.com/webinars.html

R- RQDA package:  http://rqda.r-forge.r-project.org/

Rolly Constable, Marla Cowell, Sarita Zornek Crawford, David Golden, Jake Hartvigsen, Kathryn Morgan, Anne Mudgett, Kris Parrish, Laura Thomas, Erika Yolanda Thompson, Rosie Turner, and Mike Palmquist. (1994-2012). Ethnography, Observational Research, and Narrative Inquiry. Writing@CSU. Colorado State University. Available at: https://writing.colostate.edu/guides/guide.cfm?guideid=63 .

This introduction to content analysis by Michael Palmquist is the main resource on content analysis on the Web. It is comprehensive, yet succinct. It includes examples and an annotated bibliography. The information contained in the narrative above draws heavily from and summarizes Michael Palmquist’s excellent resource on content analysis, but it has been streamlined for the purposes of doctoral students and junior researchers in epidemiology.

At Columbia University Mailman School of Public Health, more detailed training is available through the Department of Sociomedical Sciences- P8785 Qualitative Research Methods.


Have a language expert improve your writing

Run a free plagiarism check in 10 minutes, automatically generate references for free.

  • Knowledge Base
  • Methodology

Content Analysis | A Step-by-Step Guide with Examples

Published on 5 May 2022 by Amy Luo . Revised on 5 December 2022.

Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, which can be written, oral, or visual:

  • Books, newspapers, and magazines
  • Speeches and interviews
  • Web content and social media posts
  • Photographs and films

Content analysis can be both quantitative (focused on counting and measuring) and qualitative (focused on interpreting and understanding). In both types, you categorise or ‘code’ words, themes, and concepts within the texts and then analyse the results.

Table of contents

What is content analysis used for, advantages of content analysis, disadvantages of content analysis, how to conduct content analysis.

Researchers use content analysis to find out about the purposes, messages, and effects of communication content. They can also make inferences about the producers and audience of the texts they analyse.

Content analysis can be used to quantify the occurrence of certain words, phrases, subjects, or concepts in a set of historical or contemporary texts.

In addition, content analysis can be used to make qualitative inferences by analysing the meaning and semantic relationship of words and concepts.

Because content analysis can be applied to a broad range of texts, it is used in a variety of fields, including marketing, media studies, anthropology, cognitive science, psychology, and many social science disciplines. It has various possible goals:

  • Finding correlations and patterns in how concepts are communicated
  • Understanding the intentions of an individual, group, or institution
  • Identifying propaganda and bias in communication
  • Revealing differences in communication in different contexts
  • Analysing the consequences of communication content, such as the flow of information or audience responses

Prevent plagiarism, run a free check.

  • Unobtrusive data collection

You can analyse communication and social interaction without the direct involvement of participants, so your presence as a researcher doesn’t influence the results.

  • Transparent and replicable

When done well, content analysis follows a systematic procedure that can easily be replicated by other researchers, yielding results with high reliability .

  • Highly flexible

You can conduct content analysis at any time, in any location, and at low cost. All you need is access to the appropriate sources.

Focusing on words or phrases in isolation can sometimes be overly reductive, disregarding context, nuance, and ambiguous meanings.

Content analysis almost always involves some level of subjective interpretation, which can affect the reliability and validity of the results and conclusions.

  • Time intensive

Manually coding large volumes of text is extremely time-consuming, and it can be difficult to automate effectively.

If you want to use content analysis in your research, you need to start with a clear, direct  research question .

Next, you follow these five steps.

Step 1: Select the content you will analyse

Based on your research question, choose the texts that you will analyse. You need to decide:

  • The medium (e.g., newspapers, speeches, or websites) and genre (e.g., opinion pieces, political campaign speeches, or marketing copy)
  • The criteria for inclusion (e.g., newspaper articles that mention a particular event, speeches by a certain politician, or websites selling a specific type of product)
  • The parameters in terms of date range, location, etc.

If there are only a small number of texts that meet your criteria, you might analyse all of them. If there is a large volume of texts, you can select a sample .

Step 2: Define the units and categories of analysis

Next, you need to determine the level at which you will analyse your chosen texts. This means defining:

  • The unit(s) of meaning that will be coded. For example, are you going to record the frequency of individual words and phrases, the characteristics of people who produced or appear in the texts, the presence and positioning of images, or the treatment of themes and concepts?
  • The set of categories that you will use for coding. Categories can be objective characteristics (e.g., aged 30–40, lawyer, parent) or more conceptual (e.g., trustworthy, corrupt, conservative, family-oriented).

Step 3: Develop a set of rules for coding

Coding involves organising the units of meaning into the previously defined categories. Especially with more conceptual categories, it’s important to clearly define the rules for what will and won’t be included to ensure that all texts are coded consistently.

Coding rules are especially important if multiple researchers are involved, but even if you’re coding all of the text by yourself, recording the rules makes your method more transparent and reliable.

Step 4: Code the text according to the rules

You go through each text and record all relevant data in the appropriate categories. This can be done manually or aided with computer programs, such as QSR NVivo , Atlas.ti , and Diction , which can help speed up the process of counting and categorising words and phrases.
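If you prefer to script this step yourself rather than use dedicated software, a simple keyword-based coder can be sketched in a few lines of Python. The codebook categories and terms below are hypothetical:

```python
import re
from collections import Counter

# Hypothetical codebook: each category lists the terms that count toward it.
codebook = {
    "employment": ["unemployment", "jobs", "work"],
    "economy": ["economy", "inflation", "growth"],
}

def code_text(text, codebook):
    """Return a Counter mapping category -> number of matching terms in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for category, terms in codebook.items():
        counts[category] = sum(words.count(t) for t in terms)
    return counts

speech = "We will create jobs, reduce unemployment, and grow the economy."
print(code_text(speech, codebook))
```

A scripted coder like this only captures exact word matches; themes and concepts that are phrased differently still need human judgement, which is why the coding rules in Step 3 matter.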

Step 5: Analyse the results and draw conclusions

Once coding is complete, the collected data is examined to find patterns and draw conclusions in response to your research question. You might use statistical analysis to find correlations or trends, discuss your interpretations of what the results mean, and make inferences about the creators, context, and audience of the texts.

Cite this Scribbr article


Luo, A. (2022, December 05). Content Analysis | A Step-by-Step Guide with Examples. Scribbr. Retrieved 8 October 2024, from https://www.scribbr.co.uk/research-methods/content-analysis-explained/


Chapter 17. Content Analysis

Introduction

Content analysis is a term that is used to mean both a method of data collection and a method of data analysis. Archival and historical works can be the source of content analysis, but so too can the contemporary media coverage of a story, blogs, comment posts, films, cartoons, advertisements, brand packaging, and photographs posted on Instagram or Facebook. Really, almost anything can be the “content” to be analyzed. This is a qualitative research method because the focus is on the meanings and interpretations of that content rather than strictly numerical counts or variables-based causal modeling. [1] Qualitative content analysis (sometimes referred to as QCA) is particularly useful when attempting to define and understand prevalent stories or communication about a topic of interest—in other words, when we are less interested in what particular people (our defined sample) are doing or believing and more interested in what general narratives exist about a particular topic or issue. This chapter will explore different approaches to content analysis and provide helpful tips on how to collect data, how to turn that data into codes for analysis, and how to go about presenting what is found through analysis. It is also a nice segue between our data collection methods (e.g., interviewing, observation) chapters and chapters 18 and 19, whose focus is on coding, the primary means of data analysis for most qualitative data. In many ways, the methods of content analysis are quite similar to the method of coding.


Although the body of material (“content”) to be collected and analyzed can be nearly anything, most qualitative content analysis is applied to forms of human communication (e.g., media posts, news stories, campaign speeches, advertising jingles). The point of the analysis is to understand this communication, to systematically and rigorously explore its meanings, assumptions, themes, and patterns. Historical and archival sources may be the subject of content analysis, but there are other ways to analyze (“code”) this data when not overly concerned with the communicative aspect (see chapters 18 and 19). This is why we tend to consider content analysis its own method of data collection as well as a method of data analysis. Still, many of the techniques you learn in this chapter will be helpful to any “coding” scheme you develop for other kinds of qualitative data. Just remember that content analysis is a particular form with distinct aims and goals and traditions.

An Overview of the Content Analysis Process

The First Step: Selecting Content

Figure 17.1 displays possible content for content analysis. The first step in content analysis is making smart decisions about what content you will want to analyze and to clearly connect this content to your research question or general focus of research. Why are you interested in the messages conveyed in this particular content? What will the identification of patterns here help you understand? Content analysis can be fun to do, but in order to make it research, you need to fit it into a research plan.

News stories Blogs Comment posts Lyrics
Letters to editor Films Cartoons Advertisements
Brand packaging Logos Instagram photos Tweets
Photographs Graffiti Street signs Personalized license plates
Avatars (names, shapes, presentations) Nicknames Band posters Building names

Figure 17.1. A Non-exhaustive List of "Content" for Content Analysis

To take one example, let us imagine you are interested in gender presentations in society and how presentations of gender have changed over time. There are various forms of content out there that might help you document changes. You could, for example, begin by creating a list of magazines that are coded as being for “women” (e.g., Women’s Daily Journal) and magazines that are coded as being for “men” (e.g., Men’s Health). You could then select a date range that is relevant to your research question (e.g., 1950s–1970s) and collect magazines from that era. You might create a “sample” by deciding to look at three issues for each year in the date range, along with a systematic plan for what to look at in those issues (e.g., advertisements? cartoons? titles of articles? whole articles?). You are not just going to look at some magazines willy-nilly. That would not be systematic enough to allow anyone to replicate or check your findings later on. Once you have a clear plan of what content is of interest to you and what you will be looking at, you can begin, keeping a record of everything you include as your content. This might mean a list of each advertisement you look at or each story title in those magazines along with its publication date. You may decide to have multiple “content” in your research plan. For each content, you want a clear plan for collecting, sampling, and documenting.
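A sampling frame like the one just described can be written out explicitly so that others can replicate it. This sketch assumes the magazine titles from the example; the choice of three spaced-out months per year is illustrative:

```python
# A reproducible sampling plan: three issues per year for each magazine,
# 1950-1979. Spacing the months across the year avoids seasonal bias.
magazines = ["Women's Daily Journal", "Men's Health"]
months = [1, 5, 9]

sampling_frame = [
    {"magazine": m, "year": y, "month": mo}
    for m in magazines
    for y in range(1950, 1980)
    for mo in months
]

print(len(sampling_frame))  # 2 magazines x 30 years x 3 issues = 180
```

The frame doubles as the record of everything included as content: each entry can later be annotated with what was actually collected for that issue.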

The Second Step: Collecting and Storing

Once you have a plan, you are ready to collect your data. This may entail downloading from the internet, creating a Word document or PDF of each article or picture, and storing these in a folder designated by the source and date (e.g., “Men’s Health advertisements, 1950s”). Sølvberg (2021), for example, collected posted job advertisements for three kinds of elite jobs (economic, cultural, professional) in Sweden. But collecting might also mean going out and taking photographs yourself, as in the case of graffiti, street signs, or even what people are wearing. Chaise LaDousa, an anthropologist and linguist, took photos of “house signs,” which are signs, often creative and sometimes offensive, hung by college students living in communal off-campus houses. These signs were a focal point of college culture, sending messages about the values of the students living in them. Some of the names will give you an idea: “Boot ’n Rally,” “The Plantation,” “Crib of the Rib.” The students might find these signs funny and benign, but LaDousa (2011) argued convincingly that they also reproduced racial and gender inequalities. The data here already existed—they were big signs on houses—but the researcher had to collect the data by taking photographs.

In some cases, your content will be in physical form but not amenable to photographing, as in the case of films or unwieldy physical artifacts you find in the archives (e.g., undigitized meeting minutes or scrapbooks). In this case, you need to create some kind of detailed log (fieldnotes even) of the content that you can reference. In the case of films, this might mean watching the film and writing down details for key scenes that become your data. [2] For scrapbooks, it might mean taking notes on what you are seeing, quoting key passages, describing colors or presentation style. As you might imagine, this can take a lot of time. Be sure you budget this time into your research plan.

Researcher Note

A note on data scraping: Data scraping, sometimes known as screen scraping or frame grabbing, is a way of extracting data generated by another program, as when a scraping tool grabs information from a website. This may help you collect data that is on the internet, but you need to be ethical in how you employ the scraper. A student once helped me scrape thousands of stories from the Time magazine archives at once (although it took several hours for the scraping process to complete). These stories were freely available, so the scraping process simply sped up the laborious process of copying each article of interest and saving it to my research folder. Scraping tools can sometimes be used to circumvent paywalls. Be careful here!
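To make the "be ethical" point concrete, here is a hedged sketch of a polite scraper: it builds a list of yearly archive URLs (the URL pattern is hypothetical) and pauses between requests so the server is not hammered. Always check robots.txt and the site's terms of service before fetching anything:

```python
import time
import urllib.request

def build_archive_urls(base, start_year, end_year):
    """Construct one archive URL per year (the /archive/<year> pattern is hypothetical)."""
    return [f"{base}/archive/{year}" for year in range(start_year, end_year + 1)]

def polite_fetch(urls, delay_seconds=2.0):
    """Download each page with a pause between requests.
    Only use this on freely available content you are permitted to collect."""
    pages = []
    for url in urls:
        with urllib.request.urlopen(url) as resp:
            pages.append(resp.read().decode("utf-8", errors="replace"))
        time.sleep(delay_seconds)  # rate limiting: be kind to the server
    return pages

urls = build_archive_urls("https://example.com", 1950, 1959)
```

Saving each fetched page to a dated folder, as described above, keeps the collection step documented and repeatable.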

The Third Step: Analysis

There is often an assumption among novice researchers that once you have collected your data, you are ready to write about what you have found. Actually, you haven’t yet found anything, and if you try to write up your results, you will probably be staring sadly at a blank page. Between the collection and the writing comes the difficult task of systematically and repeatedly reviewing the data in search of patterns and themes that will help you interpret the data, particularly its communicative aspect (e.g., What is it that is being communicated here, with these “house signs” or in the pages of Men’s Health ?).

The first time you go through the data, keep an open mind on what you are seeing (or hearing), and take notes about your observations that link up to your research question. In the beginning, it can be difficult to know what is relevant and what is extraneous. Sometimes, your research question changes based on what emerges from the data. Use the first round of review to consider this possibility, but then commit yourself to following a particular focus or path. If you are looking at how gender gets made or re-created, don’t follow the white rabbit down a hole about environmental injustice unless you decide that this really should be the focus of your study or that issues of environmental injustice are linked to gender presentation. In the second round of review, be very clear about emerging themes and patterns. Create codes (more on these in chapters 18 and 19) that will help you simplify what you are noticing. For example, “men as outdoorsy” might be a common trope you see in advertisements. Whenever you see this, mark the passage or picture. In your third (or fourth or fifth) round of review, begin to link up the tropes you’ve identified, looking for particular patterns and assumptions. You’ve drilled down to the details, and now you are building back up to figure out what they all mean. Start thinking about theory—either theories you have read about and are using as a frame of your study (e.g., gender as performance theory) or theories you are building yourself, as in the Grounded Theory tradition. Once you have a good idea of what is being communicated and how, go back to the data at least one more time to look for disconfirming evidence. Maybe you thought “men as outdoorsy” was of importance, but when you look hard, you note that women are presented as outdoorsy just as often. You just hadn’t paid attention. 
It is very important, as any kind of researcher but particularly as a qualitative researcher, to test yourself and your emerging interpretations in this way.

The Fourth and Final Step: The Write-Up

Only after you have fully completed analysis, with its many rounds of review and analysis, will you be able to write about what you found. The interpretation exists not in the data but in your analysis of the data. Before writing your results, you will want to very clearly describe how you chose the data here and all the possible limitations of this data (e.g., historical-trace problem or power problem; see chapter 16). Acknowledge any limitations of your sample. Describe the audience for the content, and discuss the implications of this. Once you have done all of this, you can put forth your interpretation of the communication of the content, linking to theory where doing so would help your readers understand your findings and what they mean more generally for our understanding of how the social world works. [3]

Analyzing Content: Helpful Hints and Pointers

Although every data set is unique and each researcher will have a different and unique research question to address with that data set, there are some common practices and conventions. When reviewing your data, what do you look at exactly? How will you know if you have seen a pattern? How do you note or mark your data?

Let’s start with the last question first. If your data is stored digitally, there are various ways you can highlight or mark up passages. You can, of course, do this with literal highlighters, pens, and pencils if you have print copies. But there are also qualitative software programs to help you store the data, retrieve the data, and mark the data. This can simplify the process, although it cannot do the work of analysis for you.

Qualitative software can be very expensive, so the first thing to do is to find out if your institution (or program) has a universal license its students can use. If they do not, most programs have special student licenses that are less expensive. The two most used programs at this moment are probably ATLAS.ti and NVivo. Both can cost more than $500 [4] but provide everything you could possibly need for storing data, content analysis, and coding. They also have a lot of customer support, and you can find many official and unofficial tutorials on how to use the programs’ features on the web. Dedoose, created by academic researchers at UCLA, is a decent program that lacks many of the bells and whistles of the two big programs. Instead of paying all at once, you pay monthly, as you use the program. The monthly fee is relatively affordable (less than $15), so this might be a good option for a small project. HyperRESEARCH is another basic program created by academic researchers, and it is free for small projects (those that have limited cases and material to import). You can pay a monthly fee if your project expands past the free limits. I have personally used all four of these programs, and they each have their pluses and minuses.

Regardless of which program you choose, you should know that none of them will actually do the hard work of analysis for you. They are incredibly useful for helping you store and organize your data, and they provide abundant tools for marking, comparing, and coding your data so you can make sense of it. But making sense of it will always be your job alone.

So let’s say you have some software, and you have uploaded all of your content into the program: video clips, photographs, transcripts of news stories, articles from magazines, even digital copies of college scrapbooks. Now what do you do? What are you looking for? How do you see a pattern? The answers to these questions will depend partially on the particular research question you have, or at least the motivation behind your research. Let’s go back to the idea of looking at gender presentations in magazines from the 1950s to the 1970s. Here are some things you can look at and code in the content: (1) actions and behaviors, (2) events or conditions, (3) activities, (4) strategies and tactics, (5) states or general conditions, (6) meanings or symbols, (7) relationships/interactions, (8) consequences, and (9) settings. Table 17.1 lists these with examples from our gender presentation study.

Table 17.1. Examples of What to Note During Content Analysis

What can be noted/coded Example from Gender Presentation Study
Actions and behaviors
Events or conditions
Activities
Strategies and tactics
States/conditions
Meanings/symbols
Relationships/interactions
Consequences
Settings

One thing to note about the examples in table 17.1: sometimes we note (mark, record, code) a single example, while other times, as in “settings,” we are recording a recurrent pattern. To help you spot patterns, it is useful to mark every setting, including a notation on gender. Using software can help you do this efficiently. You can then call up “setting by gender” and note this emerging pattern. There’s an element of counting here, which we normally think of as quantitative data analysis, but we are using the count to identify a pattern that will be used to help us interpret the communication. Content analyses often include counting as part of the interpretive (qualitative) process.
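The "setting by gender" tally described above is easy to sketch with Python's `Counter`; the coded observations here are invented for illustration:

```python
from collections import Counter

# Hypothetical coded observations: (setting, gender) recorded per advertisement.
observations = [
    ("outdoors", "men"), ("kitchen", "women"), ("outdoors", "men"),
    ("office", "men"), ("kitchen", "women"), ("outdoors", "women"),
]

# Cross-tabulate: how often does each setting co-occur with each gender?
setting_by_gender = Counter(observations)
print(setting_by_gender[("outdoors", "men")])  # 2
```

The counts themselves are not the finding; they are the prompt for interpretation (why do these settings pattern by gender in this sample?).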

In your own study, you may not need or want to look at all of the elements listed in table 17.1. Even in our imagined example, some are more useful than others. For example, “strategies and tactics” is a bit of a stretch here. In studies that are looking specifically at, say, policy implementation or social movements, this category will prove much more salient.

Another way to think about “what to look at” is to consider aspects of your content in terms of units of analysis. You can drill down to the specific words used (e.g., the adjectives commonly used to describe “men” and “women” in your magazine sample) or move up to the more abstract level of concepts used (e.g., the idea that men are more rational than women). Counting for the purpose of identifying patterns is particularly useful here. How many times is that idea of women’s irrationality communicated? How is it communicated (in comic strips, fictional stories, editorials, etc.)? Does the incidence of the concept change over time? Perhaps the “irrational woman” was everywhere in the 1950s, but by the 1970s, it was no longer showing up in stories and comics. By tracing its usage and prevalence over time, you might come up with a theory or story about gender presentation during the period. Table 17.2 provides more examples of using different units of analysis for this work along with suggestions for effective use.

Table 17.2. Examples of Unit of Analysis in Content Analysis

Unit of Analysis How Used...
Words
Themes
Characters
Paragraphs
Items
Concepts
Semantics
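Tracing a concept's prevalence over time, as discussed above, can also be sketched in code. The occurrence years below are invented; each entry stands for one coded instance of the "irrational woman" trope:

```python
from collections import Counter

# Hypothetical coded occurrences, one entry per instance, tagged with year.
occurrences = [1952, 1953, 1953, 1955, 1958, 1961, 1964, 1972]

# Bucket each occurrence into its decade (1952 -> 1950, etc.).
by_decade = Counter((year // 10) * 10 for year in occurrences)
print(sorted(by_decade.items()))  # [(1950, 5), (1960, 2), (1970, 1)]
```

A declining count like this would support the narrative that the trope faded after the 1950s, but you would still return to the texts to check how, not just how often, it appeared.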

Every qualitative content analysis is unique in its particular focus and particular data used, so there is no single correct way to approach analysis. You should have a better idea, however, of what kinds of things to look for and how to go about looking for them. The next two chapters will take you further into the coding process, the primary analytical tool for qualitative research in general.

Further Readings

Cidell, Julie. 2010. “Content Clouds as Exploratory Qualitative Data Analysis.” Area 42(4):514–523. A demonstration of using visual “content clouds” as a form of exploratory qualitative data analysis using transcripts of public meetings and content of newspaper articles.

Hsieh, Hsiu-Fang, and Sarah E. Shannon. 2005. “Three Approaches to Qualitative Content Analysis.” Qualitative Health Research 15(9):1277–1288. Distinguishes three distinct approaches to QCA: conventional, directed, and summative. Uses hypothetical examples from end-of-life care research.

Jackson, Romeo, Alex C. Lange, and Antonio Duran. 2021. “A Whitened Rainbow: The In/Visibility of Race and Racism in LGBTQ Higher Education Scholarship.” Journal Committed to Social Change on Race and Ethnicity (JCSCORE) 7(2):174–206.* Using a “critical summative content analysis” approach, examines research published on LGBTQ people between 2009 and 2019.

Krippendorff, Klaus. 2018. Content Analysis: An Introduction to Its Methodology . 4th ed. Thousand Oaks, CA: SAGE. A very comprehensive textbook on both quantitative and qualitative forms of content analysis.

Mayring, Philipp. 2022. Qualitative Content Analysis: A Step-by-Step Guide . Thousand Oaks, CA: SAGE. Formulates an eight-step approach to QCA.

Messinger, Adam M. 2012. “Teaching Content Analysis through ‘Harry Potter.’” Teaching Sociology 40(4):360–367. This is a fun example of a relatively brief foray into content analysis using the music found in Harry Potter films.

Neuendorf, Kimberly A. 2002. The Content Analysis Guidebook . Thousand Oaks, CA: SAGE. Although a helpful guide to content analysis in general, be warned that this textbook definitely favors quantitative over qualitative approaches to content analysis.

Schreier, Margrit. 2012. Qualitative Content Analysis in Practice . Thousand Oaks, CA: SAGE. Arguably the most accessible guidebook for QCA, written by a professor based in Germany.

Weber, Matthew A., Shannon Caplan, Paul Ringold, and Karen Blocksom. 2017. “Rivers and Streams in the Media: A Content Analysis of Ecosystem Services.” Ecology and Society 22(3).* Examines the content of a blog hosted by National Geographic and articles published in The New York Times and the Wall Street Journal for stories on rivers and streams (e.g., water quality, flooding).

  • There are ways of handling content analysis quantitatively, however. Some practitioners therefore specify qualitative content analysis (QCA). In this chapter, all content analysis is QCA unless otherwise noted. ↵
  • Note that some qualitative software allows you to upload whole films or film clips for coding. You will still have to get access to the film, of course. ↵
  • See chapter 20 for more on the final presentation of research. ↵
  • Actually, ATLAS.ti is an annual license, while NVivo is a perpetual license, but both are going to cost you at least $500 to use. Student rates may be lower. And don’t forget to ask your institution or program if they already have a software license you can use. ↵

A method of both data collection and data analysis in which a given content (textual, visual, graphic) is examined systematically and rigorously to identify meanings, themes, patterns and assumptions.  Qualitative content analysis (QCA) is concerned with gathering and interpreting an existing body of material.    

Introduction to Qualitative Research Methods Copyright © 2023 by Allison Hurst is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.


What Is Qualitative Content Analysis?

QCA Explained Simply (With Examples)

By: Jenna Crosley (PhD). Reviewed by: Dr Eunice Rautenbach (DTech) | February 2021

If you’re in the process of preparing for your dissertation, thesis or research project, you’ve probably encountered the term “ qualitative content analysis ” – it’s quite a mouthful. If you’ve landed on this post, you’re probably a bit confused about it. Well, the good news is that you’ve come to the right place…

Overview: Qualitative Content Analysis

  • What (exactly) is qualitative content analysis
  • The two main types of content analysis
  • When to use content analysis
  • How to conduct content analysis (the process)
  • The advantages and disadvantages of content analysis

1. What is content analysis?

Content analysis is a  qualitative analysis method  that focuses on recorded human artefacts such as manuscripts, voice recordings and journals. Content analysis investigates these written, spoken and visual artefacts without explicitly extracting data from participants – this is called  unobtrusive  research.

In other words, with content analysis, you don’t necessarily need to interact with participants (although you can if necessary); you can simply analyse the data that they have already produced. With this type of analysis, you can analyse data such as text messages, books, Facebook posts, videos, and audio (just to mention a few).

The basics – explicit and implicit content

When working with content analysis, explicit and implicit content will play a role. Explicit data is transparent and easy to identify, while implicit data is that which requires some form of interpretation and is often of a subjective nature. Sounds a bit fluffy? Here’s an example:

Joe: Hi there, what can I help you with? 

Lauren: I recently adopted a puppy and I’m worried that I’m not feeding him the right food. Could you please advise me on what I should be feeding? 

Joe: Sure, just follow me and I’ll show you. Do you have any other pets?

Lauren: Only one, and it tweets a lot!

In this exchange, the explicit data indicates that Joe is helping Lauren to find the right puppy food. Joe asks Lauren whether she has any other pets besides her puppy. This data is explicit because it requires no interpretation.

On the other hand, implicit data , in this case, includes the fact that the speakers are in a pet store. This information is not clearly stated but can be inferred from the conversation, where Joe is helping Lauren to choose pet food. An additional piece of implicit data is that Lauren likely has some type of bird as a pet. This can be inferred from the way that Lauren states that her pet “tweets”.

As you can see, explicit and implicit data both play a role in human interaction  and are an important part of your analysis. However, it’s important to differentiate between these two types of data when you’re undertaking content analysis. Interpreting implicit data can be rather subjective as conclusions are based on the researcher’s interpretation. This can introduce an element of bias , which risks skewing your results.

Explicit and implicit data both play an important role in your content analysis, but it’s important to differentiate between them.

2. The two types of content analysis

Now that you understand the difference between implicit and explicit data, let’s move on to the two general types of content analysis : conceptual and relational content analysis. Importantly, while conceptual and relational content analysis both follow similar steps initially, the aims and outcomes of each are different.

Conceptual analysis focuses on the number of times a concept occurs in a set of data and is generally focused on explicit data. For example, if you were to have the following conversation:

Marie: She told me that she has three cats.

Jean: What are her cats’ names?

Marie: I think the first one is Bella, the second one is Mia, and… I can’t remember the third cat’s name.

In this data, you can see that the word “cat” has been used three times. Through conceptual content analysis, you can deduce that cats are the central topic of the conversation. You can also perform a frequency analysis , where you assess the term’s frequency in the data. For example, in the exchange above, the word “cat” makes up 9% of the data. In other words, conceptual analysis brings a little bit of quantitative analysis into your qualitative analysis.
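That 9% figure can be checked with a few lines of Python. Here any token beginning with "cat" (cats, cats', cat's) is counted as a mention:

```python
import re

conversation = (
    "She told me that she has three cats. "
    "What are her cats' names? "
    "I think the first one is Bella, the second one is Mia, "
    "and I can't remember the third cat's name."
)

# Tokenise: lowercase words, keeping apostrophes so "cat's" stays one token.
words = re.findall(r"[a-z']+", conversation.lower())
cat_mentions = sum(1 for w in words if w.startswith("cat"))
share = cat_mentions / len(words)
print(cat_mentions, round(share * 100))  # 3 mentions, ~9% of the words
```

This is the quantitative sliver inside conceptual analysis: the count is exact, but deciding that "cats" is the central topic remains an interpretive step.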

As you can see, the above data is without interpretation and focuses on explicit data . Relational content analysis, on the other hand, takes a more holistic view by focusing more on implicit data in terms of context, surrounding words and relationships.

There are three types of relational analysis:

  • Affect extraction
  • Proximity analysis
  • Cognitive mapping

Affect extraction is when you assess concepts according to emotional attributes. These emotions are typically mapped on scales, such as a Likert scale or a rating scale ranging from 1 to 5, where 1 is “very sad” and 5 is “very happy”.

If participants are talking about their achievements, they are likely to be given a score of 4 or 5, depending on how good they feel about it. If a participant is describing a traumatic event, they are likely to have a much lower score, either 1 or 2.
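A minimal sketch of affect extraction, assuming the passages have already been hand-scored on the 1–5 scale described above (the scores themselves are invented):

```python
# Hypothetical affect scores (1 = very sad ... 5 = very happy)
# assigned to each coded passage, grouped by participant.
scores = {
    "participant_A": [5, 4, 4],  # talking about achievements
    "participant_B": [1, 2],     # describing a traumatic event
}

# Summarise each participant's overall affect as a mean score.
averages = {p: sum(s) / len(s) for p, s in scores.items()}
print(averages)
```

Averaging is only one summary choice; reporting the spread of scores, or quoting the passages behind the extremes, often tells a richer story.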

Proximity analysis identifies explicit terms (such as those found in a conceptual analysis) and the patterns in terms of how they co-occur in a text. In other words, proximity analysis investigates the relationship between terms and aims to group these to extract themes and develop meaning.

Proximity analysis is typically utilised when you’re looking for hard facts rather than emotional, cultural, or contextual factors. For example, if you were to analyse a political speech, you may want to focus only on what has been said, rather than implications or hidden meanings. To do this, you would make use of explicit data, discounting any underlying meanings and implications of the speech.
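Proximity analysis can be sketched as a co-occurrence count within a fixed word window around a target term. The sentence and window size below are illustrative:

```python
import re
from collections import Counter

text = "unemployment hurts the economy and unemployment breeds inequality"
words = re.findall(r"[a-z]+", text.lower())

# Count every word that appears within `window` positions of the target.
target, window = "unemployment", 2
neighbours = Counter()
for i, w in enumerate(words):
    if w == target:
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                neighbours[words[j]] += 1

print(neighbours.most_common(3))
```

Grouping the most frequent neighbours (here "economy", "breeds", "inequality", among others) is the starting point for extracting themes from how terms co-occur.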

Lastly, there’s cognitive mapping, which can be used in addition to, or in place of, proximity analysis. Cognitive mapping involves taking different texts and comparing them in a visual format – i.e. a cognitive map. Typically, you’d use cognitive mapping in studies that assess changes in terms, definitions, and meanings over time. It can also serve as a way to visualise affect extraction or proximity analysis and is often presented in a form such as a graphic map.

Example of a cognitive map

To recap on the essentials, content analysis is a qualitative analysis method that focuses on recorded human artefacts . It involves both conceptual analysis (which is more numbers-based) and relational analysis (which focuses on the relationships between concepts and how they’re connected).


3. When should you use content analysis?

Content analysis is a useful tool that provides insight into trends of communication . For example, you could use a discussion forum as the basis of your analysis and look at the types of things the members talk about as well as how they use language to express themselves. Content analysis is flexible in that it can be applied to the individual, group, and institutional level.

Content analysis is typically used in studies where the aim is to better understand factors such as behaviours, attitudes, values, emotions, and opinions . For example, you could use content analysis to investigate an issue in society, such as miscommunication between cultures. In this example, you could compare patterns of communication in participants from different cultures, which will allow you to create strategies for avoiding misunderstandings in intercultural interactions.

Another example could include conducting content analysis on a publication such as a book. Here you could gather data on the themes, topics, language use and opinions reflected in the text to draw conclusions regarding the political (such as conservative or liberal) leanings of the publication.

Content analysis is typically used in projects where the research aims involve getting a better understanding of factors such as behaviours, attitudes, values, emotions, and opinions.

4. How to conduct a qualitative content analysis

Conceptual and relational content analysis differ in terms of their exact process ; however, there are some similarities. Let’s have a look at these first – i.e., the generic process:

  • Recap on your research questions
  • Undertake bracketing to identify biases
  • Operationalise your variables and develop a coding scheme
  • Code the data and undertake your analysis

Step 1 – Recap on your research questions

It’s always useful to begin a project with research questions , or at least with an idea of what you are looking for. In fact, if you’ve spent time reading this blog, you’ll know that it’s useful to recap on your research questions, aims and objectives when undertaking pretty much any research activity. In the context of content analysis, it’s difficult to know what needs to be coded and what doesn’t, without a clear view of the research questions.

For example, if you were to code a conversation focused on basic issues of social justice, you may be met with a wide range of topics that may be irrelevant to your research. However, if you approach this data set with the specific intent of investigating opinions on gender issues, you will be able to focus on this topic alone, which would allow you to code only what you need to investigate.


Step 2 – Reflect on your personal perspectives and biases

It’s vital that you reflect on your own preconceptions of the topic at hand and identify the biases that you might drag into your content analysis – this is called “bracketing”. By identifying these upfront, you’ll be more aware of them and less likely to let them subconsciously influence your analysis.

For example, if you were to investigate how a community converses about unequal access to healthcare, it is important to assess your views to ensure that you don’t project these onto your understanding of the opinions put forth by the community. If you have access to medical aid, for instance, you should not allow this to interfere with your examination of unequal access.


Step 3 – Operationalise your variables and develop a coding scheme

Next, you need to operationalise your variables. But what does that mean? Simply put, it means that you have to define each variable or construct. Give every item a clear definition – what does it mean (include) and what does it not mean (exclude)? For example, if you were to investigate children’s views on healthy foods, you would first need to define what age group/range you’re looking at, and then also define what you mean by “healthy foods”.

In combination with the above, it is important to create a coding scheme, which will consist of information about your variables (how you defined each variable), as well as a process for analysing the data. For this, you would refer back to how you operationalised/defined your variables so that you know how to code your data.

For example, when coding, when should you code a food as “healthy”? What makes a food choice healthy? Is it the absence of sugar or saturated fat? Is it the presence of fibre and protein? It’s very important to have clearly defined variables to achieve consistent coding – without this, your analysis will get very muddy, very quickly.
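To make this concrete, here’s a rough Python sketch of what a coding scheme might look like in practice. The single code and its food terms are hypothetical illustrations, not a validated nutrition standard:

```python
# A minimal, illustrative coding scheme: each code lists the terms it
# includes and excludes. The food terms are hypothetical examples.
CODING_SCHEME = {
    "healthy food": {
        "include": {"broccoli", "peaches", "bananas", "oats"},
        "exclude": {"candy", "soda", "chips"},
    },
}

def code_token(token):
    """Return the code that applies to a token, or None."""
    token = token.lower()
    for code, rules in CODING_SCHEME.items():
        if token in rules["include"]:
            return code
        if token in rules["exclude"]:
            return None  # explicitly outside the code's definition
    return None  # not covered by the scheme at all

print(code_token("Broccoli"))  # healthy food
print(code_token("soda"))      # None
```

Writing the include/exclude sets down explicitly like this is one simple way to keep coding consistent across a project (and across multiple coders).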


Step 4 – Code and analyse the data

The next step is to code the data. At this stage, there are some differences between conceptual and relational analysis.

As described earlier in this post, conceptual analysis looks at the existence and frequency of concepts, whereas a relational analysis looks at the relationships between concepts. For both types of analyses, it is important to pre-select a concept that you wish to assess in your data. Using the example of studying children’s views on healthy food, you could pre-select the concept of “healthy food” and assess the number of times the concept pops up in your data.

Here is where conceptual and relational analysis start to differ.

At this stage of conceptual analysis , it is necessary to decide on the level of analysis you’ll perform on your data – will this be at the word, phrase, sentence, or thematic level? For example, will you code the phrase “healthy food” on its own? Will you code each term relating to healthy food (e.g., broccoli, peaches, bananas, etc.) with the code “healthy food”, or will these be coded individually? It is very important to establish this from the get-go to avoid inconsistencies that could result in you having to code your data all over again.
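For instance, if you decided to code at the word level and map several individual food terms to one broader code, a simple frequency count might look like this (the term list and responses are made up for illustration):

```python
from collections import Counter

# Illustrative choice of level: code at the word level, mapping several
# individual terms to one broader code. Terms and responses are made up.
TERM_TO_CODE = {
    "broccoli": "healthy food",
    "peaches": "healthy food",
    "bananas": "healthy food",
    "chips": "unhealthy food",
}

responses = [
    "I like broccoli and bananas",
    "chips are my favourite snack",
]

counts = Counter()
for response in responses:
    for word in response.lower().split():
        if word in TERM_TO_CODE:
            counts[TERM_TO_CODE[word]] += 1

print(counts)  # Counter({'healthy food': 2, 'unhealthy food': 1})
```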

On the other hand, with relational analysis, the key decision is the type of analysis. So, will you use affect extraction? Proximity analysis? Cognitive mapping? A mix? It’s vital to determine the type of analysis before you begin to code your data so that you can maintain the reliability and validity of your research .


How to conduct conceptual analysis

First, let’s have a look at the process for conceptual analysis.

Once you’ve decided on your level of analysis, you need to establish how you will code your concepts, and how many of these you want to code. Here you can choose whether you want to code in a deductive or inductive manner. Just to recap, deductive coding is when you begin the coding process with a set of pre-determined codes, whereas inductive coding entails the codes emerging as you progress with the coding process. Here it is also important to decide what should be included and excluded from your analysis, and also what levels of implication you wish to include in your codes.

For example, if you have the concept of “tall”, can you include “up in the clouds”, derived from the sentence, “the giraffe’s head is up in the clouds” in the code, or should it be a separate code? In addition to this, you need to know what levels of words may be included in your codes or not. For example, if you say, “the panda is cute” and “look at the panda’s cuteness”, can “cute” and “cuteness” be included under the same code?
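If you decide that word forms like “cute” and “cuteness” do belong under one code, you’ll need a normalisation rule to group them. In practice you’d typically use a stemmer or lemmatiser (e.g., from NLTK or spaCy); this crude suffix-stripping sketch just illustrates the decision you’re making:

```python
# A deliberately crude normalisation rule for grouping word forms under
# one code. Real projects would use a proper stemmer or lemmatiser;
# this sketch only illustrates the idea.
def normalise(word):
    word = word.lower()
    for suffix in ("ness", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

print(normalise("cute"))      # cute
print(normalise("cuteness"))  # cute
print(normalise("pandas"))    # panda
```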

Once you’ve considered the above, it’s time to code the text. We’ve already published a detailed post about coding, so we won’t go into that process here. Once you’re done coding, you can move on to analysing your results. This is where you will aim to find generalisations in your data, and thus draw your conclusions .

How to conduct relational analysis

Now let’s return to relational analysis.

As mentioned, you want to look at the relationships between concepts . To do this, you’ll need to create categories by reducing your data (in other words, grouping similar concepts together) and then also code for words and/or patterns. These are both done with the aim of discovering whether these words exist, and if they do, what they mean.

Your next step is to assess your data and to code the relationships between your terms and meanings, so that you can move on to your final step, which is to sum up and analyse the data.
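As a rough sketch of this relational step, suppose each unit of your data has already been coded. Counting which codes co-occur within a unit is one simple way to surface candidate relationships (the codes below are hypothetical):

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded units: each set holds the codes assigned to one
# response. Counting code pairs within a unit surfaces relationships.
coded_units = [
    {"healthy food", "cost"},
    {"healthy food", "taste"},
    {"healthy food", "cost", "taste"},
]

pair_counts = Counter()
for codes in coded_units:
    for pair in combinations(sorted(codes), 2):
        pair_counts[pair] += 1

for pair, n in pair_counts.most_common():
    print(pair, n)
```

A pair that co-occurs often (here, “healthy food” with “cost”) is a candidate relationship whose meaning you’d then interpret qualitatively.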

To recap, it’s important to start your analysis process by reviewing your research questions and identifying your biases. From there, you need to operationalise your variables, code your data and then analyse it.


5. What are the pros & cons of content analysis?

One of the main advantages of content analysis is that it allows you to use a mix of quantitative and qualitative research methods, which results in a more scientifically rigorous analysis.

For example, with conceptual analysis, you can count the number of times that a term or a code appears in a dataset, which can be assessed from a quantitative standpoint. In addition to this, you can then use a qualitative approach to investigate the underlying meanings of these and relationships between them.

Content analysis is also unobtrusive and therefore poses fewer ethical issues than some other analysis methods. As the content you’ll analyse oftentimes already exists, you’ll analyse what has been produced previously, and so you won’t have to collect data directly from participants. When coded correctly, data is analysed in a very systematic and transparent manner, which greatly reduces concerns about replicability (how possible it is to recreate research under the same conditions).

On the downside , qualitative research (in general, not just content analysis) is often critiqued for being too subjective and for not being scientifically rigorous enough. This is where reliability (how replicable a study is by other researchers) and validity (how suitable the research design is for the topic being investigated) come into play – if you take these into account, you’ll be on your way to achieving sound research results.


Recap: Qualitative content analysis

In this post, we’ve covered a lot of ground: the generic content analysis process, how conceptual and relational analysis differ, and the pros and cons of the method.

If you have any questions about qualitative content analysis, feel free to leave a comment below. If you’d like 1-on-1 help with your qualitative content analysis, be sure to book an initial consultation with one of our friendly Research Coaches.



How to do a content analysis


Contents:

  • What is content analysis?
  • Why would you use a content analysis?
  • Types of content analysis (conceptual and relational)
  • Reliability and validity
  • The advantages and disadvantages of content analysis
  • A step-by-step guide to conducting a content analysis (steps 1–7)
  • Frequently asked questions about content analysis

What is content analysis?

In research, content analysis is the process of analyzing content and its features with the aim of identifying patterns and the presence of words, themes, and concepts within the content. Simply put, content analysis is a research method that aims to present the trends, patterns, concepts, and ideas in content as objective, quantitative or qualitative data , depending on the specific use case.

As such, some of the objectives of content analysis include:

  • Simplifying complex, unstructured content.
  • Identifying trends, patterns, and relationships in the content.
  • Determining the characteristics of the content.
  • Identifying the intentions of individuals through the analysis of the content.
  • Identifying the implied aspects in the content.

Typically, when doing a content analysis, you’ll gather data not only from written text sources like newspapers, books, journals, and magazines but also from a variety of other oral and visual sources of content like:

  • Voice recordings, speeches, and interviews.
  • Web content, blogs, and social media content.
  • Films, videos, and photographs.

One of content analysis’s distinguishing features is that you'll be able to gather data for research without physically gathering data from participants. In other words, when doing a content analysis, you don't need to interact with people directly.

The process of doing a content analysis usually involves categorizing or coding concepts, words, and themes within the content and analyzing the results. We’ll look at the process in more detail below.

Why would you use a content analysis?

Typically, you’ll use content analysis when you want to:

  • Identify the intentions, communication trends, or communication patterns of an individual, a group of people, or even an institution.
  • Analyze and describe the behavioral and attitudinal responses of individuals to communications.
  • Determine the emotional or psychological state of an individual or a group of people.
  • Analyze the international differences in communication content.
  • Analyze audience responses to content.

Keep in mind, though, that these are just some examples of use cases where a content analysis might be appropriate and there are many others.

The key thing to remember is that content analysis will help you quantify the occurrence of specific words, phrases, themes, and concepts in content. Moreover, it can also be used when you want to make qualitative inferences out of the data by analyzing the semantic meanings and interrelationships between words, themes, and concepts.

Types of content analysis

In general, there are two types of content analysis: conceptual and relational analysis . Although these two types follow largely similar processes, their outcomes differ. As such, each of these types can provide different results, interpretations, and conclusions. With that in mind, let’s now look at these two types of content analysis in more detail.

Conceptual content analysis

With conceptual analysis, you’ll determine the existence of certain concepts within the content and identify their frequency. In other words, conceptual analysis involves counting the number of times a specific concept appears in the content.

Conceptual analysis is typically focused on explicit data, which means you’ll focus your analysis on a specific concept to identify its presence in the content and determine its frequency.

However, when conducting a content analysis, you can also use implicit data. This approach is more involved and complicated, and it requires the use of a dictionary, contextual translation rules, or a combination of both.

No matter what type you use, conceptual analysis brings an element of quantitative analysis into a qualitative approach to research.

Relational content analysis

Relational content analysis takes conceptual analysis a step further. So, while the process starts in the same way by identifying concepts in content, it doesn’t focus on finding the frequency of these concepts, but rather on the relationships between the concepts, the context in which they appear in the content, and their interrelationships.

Before starting with a relational analysis, you’ll first need to decide on which subcategory of relational analysis you’ll use:

  • Affect extraction: With this relational content analysis approach, you’ll evaluate concepts based on their emotional attributes. You’ll typically assess these emotions on a rating scale with higher values assigned to positive emotions and lower values to negative ones. In turn, this allows you to capture the emotions of the writer or speaker at the time the content is created. The main difficulty with this approach is that emotions can differ over time and across populations.
  • Proximity analysis: With this approach, you’ll identify concepts as in conceptual analysis, but you’ll evaluate the way in which they occur together in the content. In other words, proximity analysis allows you to analyze the relationship between concepts and derive a concept matrix from which you’ll be able to develop meaning. Proximity analysis is typically used when you want to extract facts from the content rather than contextual, emotional, or cultural factors.
  • Cognitive mapping: Finally, cognitive mapping can be used with affect extraction or proximity analysis. It’s a visualization technique that allows you to create a model that represents the overall meaning of content and presents it as a graphic map of the relationships between concepts. As such, it’s also commonly used when analyzing the changes in meanings, definitions, and terms over time.
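To illustrate the proximity analysis idea, here’s a small sketch that counts how often two target concepts occur within a fixed word window of each other. The window size, targets, and sentence are illustrative assumptions, and a real study would work over a full corpus:

```python
from collections import Counter

# Proximity analysis sketch: count how often two target concepts occur
# within a fixed word window of each other. The co-occurrence counts
# form the basis of a concept matrix.
def proximity_counts(text, targets, window=5):
    words = text.lower().split()
    hits = [(i, w) for i, w in enumerate(words) if w in targets]
    counts = Counter()
    for i, (pos_a, a) in enumerate(hits):
        for pos_b, b in hits[i + 1:]:
            if pos_b - pos_a <= window and a != b:
                counts[tuple(sorted((a, b)))] += 1
    return counts

text = "rising unemployment hurts the economy and deepens inequality"
result = proximity_counts(text, {"unemployment", "economy", "inequality"})
print(result)
```

Here “unemployment” and “economy” fall within the window, while “unemployment” and “inequality” sit too far apart to count, which is exactly the kind of distinction proximity analysis draws.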

Reliability and validity

Now that we’ve seen what content analysis is and looked at the different types of content analysis, it’s important to understand how reliable it is as a research method. We’ll also look at what criteria impact the validity of a content analysis.

Reliability

There are three criteria that determine the reliability of a content analysis:

  • Stability. Stability refers to the tendency of coders to consistently categorize or code the same data in the same way over time.
  • Reproducibility. This criterion refers to the tendency of coders to classify category membership in the same way.
  • Accuracy. Accuracy refers to the extent to which the classification of content corresponds to a specific standard.

Keep in mind, though, that because you’ll need to code or categorize the concepts you’ll aim to identify and analyze manually, you’ll never be able to eliminate human error. However, you’ll be able to minimize it.
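One common way to quantify that human error, and the reproducibility criterion above, is intercoder agreement. This sketch computes simple percent agreement and Cohen’s kappa for two coders over the same five hypothetical units:

```python
# Reproducibility is often estimated with intercoder agreement. This
# sketch computes percent agreement and Cohen's kappa for two coders
# over the same five hypothetical units.
coder_a = ["healthy", "healthy", "unhealthy", "healthy", "unhealthy"]
coder_b = ["healthy", "unhealthy", "unhealthy", "healthy", "unhealthy"]

n = len(coder_a)
observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

# Expected chance agreement from each coder's marginal proportions.
labels = set(coder_a) | set(coder_b)
expected = sum(
    (coder_a.count(label) / n) * (coder_b.count(label) / n)
    for label in labels
)
kappa = (observed - expected) / (1 - expected)

print(f"agreement={observed:.2f}, kappa={kappa:.2f}")
# agreement=0.80, kappa=0.62
```

Kappa corrects raw agreement for the agreement you’d expect by chance, which is why it comes out lower than the raw percentage.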

Validity

In turn, three criteria determine the validity of a content analysis:

  • Closeness of categories. This is achieved by using multiple classifiers to get an agreed-upon definition for a specific category by using either implicit variables or synonyms. In this way, the category can be broadened to include more relevant data.
  • Conclusions. Here, it’s crucial to decide what level of implication will be allowable. In other words, it’s important to consider whether the conclusions are valid based on the data or whether they can be explained using some other phenomena.
  • Generalizability of the results of the analysis to a theory. Generalizability comes down to how you determine your categories as mentioned above and how reliable those categories are. In turn, this relies on how accurate the categories are at measuring the concepts or ideas that you’re looking to measure.

The advantages and disadvantages of content analysis

Considering everything mentioned above, there are definite advantages and disadvantages when it comes to content analysis:

Advantages:

  • It doesn’t require physical interaction with any participant, or, in other words, it’s unobtrusive. This means that the presence of a researcher is unlikely to influence the results. As a result, there are also fewer ethical concerns compared to some other analysis methods.
  • It uses a systematic and transparent approach to gathering data. When done correctly, content analysis is easily repeatable by other researchers, which, in turn, leads to more reliable results.
  • Because researchers are able to conduct content analysis in any location, at any time, and at a lower cost compared to many other analysis methods, it’s typically more flexible.
  • It allows researchers to effectively combine quantitative and qualitative analysis into one approach, which then results in a more rigorous scientific analysis of the data.

Disadvantages:

  • It always involves an element of subjective interpretation. In many cases, it’s criticized for being too subjective and not scientifically rigorous enough. Fortunately, when applying the criteria of reliability and validity, researchers can produce accurate results with content analysis.
  • It’s inherently reductive. In other words, by focusing only on specific concepts, words, or themes, researchers will often disregard any context, nuances, or deeper meaning to the content.
  • Although it offers researchers an inexpensive and flexible approach to gathering and analyzing data, coding or categorizing a large number of concepts is time-consuming.
  • Coding can be challenging to automate, which means the process largely relies on manual processes.

A step-by-step guide to conducting a content analysis

Let’s now look at the steps you’ll need to follow when doing a content analysis.

Step 1: Develop your research questions

The first step will always be to formulate your research questions. This is simply because, without clear and defined research questions, you won’t know what question to answer and, by implication, won’t be able to code your concepts.

Step 2: Choose the content you’ll analyze

Based on your research questions, you’ll then need to decide what content you’ll analyze. Here, you’ll use three factors to find the right content:

  • The type of content. Here you’ll need to consider the various types of content you’ll use and their medium – for example, blog posts, social media, newspapers, or online articles.
  • What criteria you’ll use for inclusion . Here you’ll decide what criteria you’ll use to include content. This can, for instance, be the mentioning of a certain event or advertising a specific product.
  • Your parameters . Here, you’ll decide what content you’ll include based on specified parameters in terms of date and location.
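Applied programmatically, the inclusion criteria and parameters above amount to a filter over your candidate corpus. The records, keyword criterion, and date cutoff below are all hypothetical:

```python
from datetime import date

# Hypothetical corpus records; the inclusion criterion (a keyword
# mention) and the parameters (a date cutoff) mirror the factors above.
articles = [
    {"source": "blog", "date": date(2021, 3, 1), "text": "new product advert"},
    {"source": "news", "date": date(2019, 6, 5), "text": "product recall"},
    {"source": "news", "date": date(2021, 8, 9), "text": "weather report"},
]

def include(article, keyword="product", start=date(2020, 1, 1)):
    """Keep only articles mentioning the keyword within the date range."""
    return keyword in article["text"] and article["date"] >= start

selected = [a for a in articles if include(a)]
print([a["source"] for a in selected])  # ['blog']
```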

Step 3: Identify your biases

The next step is to consider your own pre-conception of the questions and identify your biases. This process is referred to as bracketing and allows you to be aware of your biases before you start your research with the result that they’ll be less likely to influence the analysis.

Step 4: Define the units and categories of coding

Your next step would be to define the units of meaning that you’ll code. This will, for example, be the number of times a concept appears in the content or the treatment of concepts, words, or themes in the content. You’ll then need to define the set of categories you’ll use for coding, which can be either objective or more conceptual.

Step 5: Develop a coding scheme

Based on the above, you’ll then organize the units of meaning into your defined categories. Apart from this, your coding scheme will also determine how you’ll analyze the data.

Step 6: Code the content

The next step is to code the content. During this process, you’ll work through the content and record the data according to your coding scheme. It’s also here where conceptual and relational analysis starts to deviate in relation to the process you’ll need to follow.

As mentioned earlier, conceptual analysis aims to identify the number of times a specific concept, idea, word, or phrase appears in the content. So, here, you’ll need to decide what level of analysis you’ll implement.

In contrast, with relational analysis, you’ll need to decide what type of relational analysis you’ll use. So, you’ll need to determine whether you’ll use affect extraction, proximity analysis, cognitive mapping, or a combination of these approaches.

Step 7: Analyze the results

Once you’ve coded the data, you’ll be able to analyze it and draw conclusions from the data based on your research questions.

Frequently asked questions about content analysis

Content analysis offers an inexpensive and flexible way to identify trends and patterns in communication content. In addition, it’s unobtrusive, which eliminates many ethical concerns and inaccuracies in research data. However, to be most effective, a content analysis must be planned and used carefully in order to ensure reliability and validity.

The two general types of content analysis are conceptual and relational analysis . Although these two types follow largely similar processes, their outcomes differ. As such, each of these types can provide different results, interpretations, and conclusions.

In qualitative research coding means categorizing concepts, words, and themes within your content to create a basis for analyzing the results. While coding, you work through the content and record the data according to your coding scheme.

Content analysis is the process of analyzing content and its features with the aim of identifying patterns and the presence of words, themes, and concepts within the content. The goal of a content analysis is to present the trends, patterns, concepts, and ideas in content as objective, quantitative or qualitative data, depending on the specific use case.

Content analysis is a qualitative method of data analysis and can be used in many different fields. It is particularly popular in the social sciences.

It is possible to do qualitative analysis without coding, but content analysis as a method of qualitative analysis requires coding or categorizing data to then analyze it according to your coding scheme in the next step.


Am J Pharm Educ. 2020 Jan; 84(1)

Demystifying Content Analysis

A. J. Kleinheksel

a The Medical College of Georgia at Augusta University, Augusta, Georgia

Nicole Rockich-Winston

Huda Tawfik

b Central Michigan University, College of Medicine, Mt. Pleasant, Michigan

Tasha R. Wyatt

Objective. In the course of daily teaching responsibilities, pharmacy educators collect rich data that can provide valuable insight into student learning. This article describes the qualitative data analysis method of content analysis, which can be useful to pharmacy educators because of its application in the investigation of a wide variety of data sources, including textual, visual, and audio files.

Findings. Both manifest and latent content analysis approaches are described, with several examples used to illustrate the processes. This article also offers insights into the variety of relevant terms and visualizations found in the content analysis literature. Finally, common threats to the reliability and validity of content analysis are discussed, along with suitable strategies to mitigate these risks during analysis.

Summary. This review of content analysis as a qualitative data analysis method will provide clarity and actionable instruction for both novice and experienced pharmacy education researchers.

INTRODUCTION

The Academy’s growing interest in qualitative research indicates an important shift in the field’s scientific paradigm. Whereas health science researchers have historically looked to quantitative methods to answer their questions, this shift signals that a purely positivist, objective approach is no longer sufficient to answer pharmacy education’s research questions. Educators who want to study their teaching and students’ learning will find content analysis an easily accessible, robust method of qualitative data analysis that can yield rigorous results for both publication and the improvement of their educational practice. Content analysis is a method designed to identify and interpret meaning in recorded forms of communication by isolating small pieces of the data that represent salient concepts and then applying or creating a framework to organize the pieces in a way that can be used to describe or explain a phenomenon. 1 Content analysis is particularly useful in situations where there is a large amount of unanalyzed textual data, such as those many pharmacy educators have already collected as part of their teaching practice. Because of its accessibility, content analysis is also an appropriate qualitative method for pharmacy educators with limited experience in educational research. This article will introduce and illustrate the process of content analysis as a way to analyze existing data, but also as an approach that may lead pharmacy educators to ask new types of research questions.

Content analysis is a well-established data analysis method that has evolved in its treatment of textual data. Content analysis was originally introduced as a strictly quantitative method, recording counts to measure the observed frequency of pre-identified targets in consumer research. 1 However, as the naturalistic qualitative paradigm became more prevalent in social sciences research and researchers became increasingly interested in the way people behave in natural settings, the process of content analysis was adapted into a more interesting and meaningful approach. Content analysis has the potential to be a useful method in pharmacy education because it can help educational researchers develop a deeper understanding of a particular phenomenon by providing structure in a large amount of textual data through a systematic process of interpretation. It also offers potential value because it can help identify problematic areas in student understanding and guide the process of targeted teaching. Several research studies in pharmacy education have used the method of content analysis. 2-7 Two studies in particular offer noteworthy examples: Wallman and colleagues employed manifest content analysis to analyze semi-structured interviews in order to explore what students learn during experiential rotations, 7 while Moser and colleagues adopted latent content analysis to evaluate open-ended survey responses on student perceptions of learning communities. 6 To elaborate on these approaches further, we will describe the two types of qualitative content analysis, manifest and latent, and demonstrate the corresponding analytical processes using examples that illustrate their benefit.

Qualitative Content Analysis

Content analysis rests on the assumption that texts are a rich data source with great potential to reveal valuable information about particular phenomena. 8 It is the process of considering both the participant and context when sorting text into groups of related categories to identify similarities and differences, patterns, and associations, both on the surface and implied within. 9-11 The method is considered high-yield in educational research because it is versatile and can be applied in both qualitative and quantitative studies. 12 While it is important to note that content analysis has application in visual and auditory artifacts (eg, an image or song), for our purposes we will largely focus on the most common application, which is the analysis of textual or transcribed content (eg, open-ended survey responses, print media, interviews, recorded observations, etc). The terminology of content analysis can vary throughout quantitative and qualitative literature, which may lead to some confusion among both novice and experienced researchers. However, there are also several agreed-upon terms and phrases that span the literature, as found in Table 1 .

Terms and Definitions Used in Qualitative Content Analysis


There is more often disagreement on terminology in the methodological approaches to content analysis, though the most common differentiation is between the two types of content: manifest and latent. In much of the literature, manifest content analysis is defined as describing what is occurring on the surface, what is literally present, and as “staying close to the text.” 8,13 Manifest content analysis is concerned with data that are easily observable both to researchers and the coders who assist in their analyses, without the need to discern intent or identify deeper meaning. It is content that can be recognized and counted with little training. Early applications of manifest analysis focused on identifying easily observable targets within text (eg, the number of instances a certain word appears in newspaper articles), film (eg, the occupation of a character), or interpersonal interactions (eg, tracking the number of times a participant blinks during an interview). 14 This application, in which frequency counts are used to understand a phenomenon, reflects a surface-level analysis and assumes there is objective truth in the data that can be revealed with very little interpretation. The number of times a target (ie, code) appears within the text is used as a way to understand its prevalence. Quantitative content analysis is always describing a positivist manifest content analysis, in that the nature of truth is believed to be objective, observable, and measurable. Qualitative research, which favors the researcher’s interpretation of an individual’s experience, may also be used to analyze manifest content. However, the intent of the application is to describe a dynamic reality that cannot be separated from the lived experiences of the researcher.
Although qualitative content analysis can be conducted whether knowledge is thought to be innate, acquired, or socially constructed, the purpose of qualitative manifest content analysis is to transcend simple word counts and delve into a deeper examination of the language in order to organize large amounts of text into categories that reflect a shared meaning. 15,16 The practical distinction between quantitative and qualitative manifest content analysis is the intention behind the analysis. The quantitative method seeks to generate a numerical value to either cite prevalence or use in statistical analyses, while the qualitative method seeks to identify a construct or concept within the text using specific words or phrases for substantiation, or to provide a more organized structure to the text being described.
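The quantitative manifest approach described above, in which occurrences of predefined targets are counted to gauge prevalence, can be sketched in a few lines of Python. The speech text and target words below are hypothetical, echoing the campaign-speech example:

```python
from collections import Counter
import re

def code_frequencies(text, targets):
    """Count occurrences of each manifest target word in the text.

    A minimal sketch of quantitative manifest analysis: prevalence is
    simply the number of times each predefined target appears."""
    words = Counter(re.findall(r"[a-z']+", text.lower()))
    return {t: words[t] for t in targets}

speech = "We need jobs. Unemployment is rising, and jobs will not create themselves."
freq = code_frequencies(speech, ["unemployment", "jobs", "work"])
# freq: {"unemployment": 1, "jobs": 2, "work": 0}
```

A qualitative manifest analysis would start from the same surface-level matches but would go on to examine the language around each hit rather than stopping at the counts.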

Latent content analysis is most often defined as interpreting what is hidden deep within the text. In this method, the role of the researcher is to discover the implied meaning in participants’ experiences. 8,13 For example, in a transcribed exchange in an office setting, a participant might say to a coworker, “Yeah, here we are…another Monday. So exciting!” The researcher would apply context in order to discover the emotion being conveyed (ie, the implied meaning). In this example, the comment could be interpreted as genuine, it could be interpreted as a sarcastic comment made in an attempt at humor in order to develop or sustain social bonds with the coworker, or the context might imply that the sarcasm was meant to convey displeasure and end the interaction.

Latent content analysis acknowledges that the researcher is intimately involved in the analytical process and that their role is to actively use mental schema, theories, and lenses to interpret and understand the data. 10 Whereas manifest analyses are typically conducted in a way that the researcher is thought to maintain distance and separation from the objects of study, latent analyses underscore the importance of the researcher co-creating meaning with the text. 17 Adding nuance to this type of content, Potter and Levine‐Donnerstein argue that within latent content analysis, there are two distinct types: latent pattern and latent projective. 14 Latent pattern content analysis seeks to establish a pattern of characteristics in the text itself, while latent projective content analysis leverages the researcher’s own interpretations of the meaning of the text. While both approaches rely on codes that emerge from the content using the coder’s own perspectives and mental schema, the distinction between these two types of analyses lies in their foci. 14 Though we do not agree, some researchers believe that all qualitative content analysis is latent content analysis. 11 These disagreements typically occur where there are differences in intent and where there are areas of overlap in the results. For example, both qualitative manifest and latent pattern content analyses may identify patterns as a result of their application, though the researcher would have approached the content with different methodological intents: a manifest approach seeks only to describe what is observed, while a latent pattern approach seeks to discover an unseen pattern. At this point, these distinctions may seem too philosophical to serve a practical purpose, so we will attempt to clarify these concepts by presenting three types of analyses for illustrative purposes, beginning with a description of how codes are created and used.

Creating and Using Codes

Codes are the currency of content analysis. Researchers use codes to organize and understand their data. Through the coding process, pharmacy educators can systematically and rigorously categorize and interpret vast amounts of text for use in their educational practice or in publication. Codes themselves are short, descriptive labels that symbolically assign a summative or salient attribute to more than one unit of meaning identified in the text. 18 To create codes, a researcher must first become immersed in the data, which typically occurs when a researcher transcribes recorded data or conducts several readings of the text. This process allows the researcher to become familiar with the scope of the data, which spurs nascent ideas about potential concepts or constructs that may exist within it. If studying a phenomenon that has already been described through an existing framework, codes can be created a priori using theoretical frameworks or concepts identified in the literature. If there is no existing framework to apply, codes can emerge during the analytical process. However, emergent codes can also be created as addenda to a priori codes that were identified before the analysis begins if the a priori codes do not sufficiently capture the researcher’s area of interest.

The process of detecting emergent codes begins with identification of units of meaning. While there is no one way to decide what qualifies as a meaning unit, researchers typically define units of meaning differently depending on what kind of analysis is being conducted. As a general rule, when dialogue is being analyzed, such as interviews or focus groups, meaning units are identified as conversational turns, though a code can be as short as one or two words. In written text, such as student reflections or course evaluation data, the researcher must decide if the text should be divided into phrases or sentences, or remain as paragraphs. This decision is usually made based on how many different units of meaning are expressed in a block of text. For example, in a paragraph, if there are several thoughts or concepts being expressed, it is best to break up the paragraph into sentences. If one sentence contains multiple ideas of interest, making it difficult to separate one important thought or behavior from another, then the sentence can be divided into smaller units, such as phrases or sentence fragments. These phrases or sentence fragments are then coded as separate meaning units. Conversely, longer or more complex units of meaning should be condensed into shorter representations that still retain the original meaning in order to reduce the cognitive burden of the analytical process. This could entail removing verbal tics (eg, “well, uhm…”) from transcribed data or simplifying a compound sentence. Condensation does not ascribe interpretation or implied meaning to a unit, but only shortens a meaning unit as much as possible while preserving the original meaning identified. 18 After condensation, a researcher can proceed to the creation of codes.
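The segmentation and condensation steps described above might be sketched as follows. The transcript, the sentence-splitting rule, and the filler-word list are illustrative assumptions; in practice, condensation is a researcher's judgment call rather than a mechanical filter:

```python
import re

# Hypothetical filler tokens to drop during condensation. Condensation only
# shortens a unit; it does not interpret it or change its meaning.
FILLERS = {"uhm", "um", "uh", "well,"}

def meaning_units(transcript):
    """Split transcribed text into sentence-level meaning units."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", transcript) if s.strip()]

def condense(unit):
    """Remove verbal tics while preserving the unit's original meaning."""
    return " ".join(w for w in unit.split() if w.lower() not in FILLERS)

raw = "Well, uhm I think the lecture helped. Uhm the cases were confusing!"
units = [condense(u) for u in meaning_units(raw)]
# units: ["I think the lecture helped.", "the cases were confusing!"]
```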

Many researchers begin their analyses with several general codes in mind that help guide their focus as defined by their research question, even in instances where the researcher has no a priori model or theory. For example, if a group of instructors are interested in examining recorded videos of their lectures to identify moments of student engagement, they may begin with using generally agreed upon concepts of engagement as codes, such as students “raising their hands,” “taking notes,” and “speaking in class.” However, as the instructors continue to watch their videos, they may notice other behaviors which were not initially anticipated. Perhaps students were seen creating flow charts based on information presented in class. Alternatively, perhaps instructors wanted to include moments when students posed questions to their peers without being prompted. In this case, the instructors would allow the codes of “creating graphic organizers” and “questioning peers” to emerge as additional ways to identify the behavior of student engagement.

Once a researcher has identified condensed units of meaning and labeled them with codes, the codes are then sorted into categories which can help provide more structure to the data. In the above example of recorded lectures, perhaps the category of “verbal behaviors” could be used to group the codes of “speaking in class” and “questioning peers.” For complex analyses, subcategories can also be used to better organize a large amount of codes, but solely at the discretion of the researcher. Two or more categories of codes are then used to identify or support a broader underlying meaning which develops into themes. Themes are most often employed in latent analyses; however, they are appropriate in manifest analyses as well. Themes describe behaviors, experiences, or emotions that occur throughout several categories. 18 Figure 1 illustrates this process. Using the same videotaped lecture example, the instructors might identify two themes of student engagement, “active engagement” and “passive engagement,” where active engagement is supported by the category of “verbal behavior” and also a category that includes the code of “raising their hands” (perhaps something along the lines of “pursuing engagement”), and the theme of “passive engagement” is supported by a category used to organize the behaviors of “taking notes” and “creating graphic organizers.”
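The resulting codes-categories-themes hierarchy from the lecture example can be represented as a simple nested structure. The category names "pursuing engagement" (suggested in the text only tentatively) and "recording behavior" are hypothetical labels for illustration:

```python
# A sketch of the themes -> categories -> codes hierarchy from the
# videotaped-lecture example.
taxonomy = {
    "active engagement": {                       # theme
        "verbal behavior": ["speaking in class", "questioning peers"],
        "pursuing engagement": ["raising their hands"],
    },
    "passive engagement": {                      # theme
        "recording behavior": ["taking notes", "creating graphic organizers"],
    },
}

def theme_for(code, taxonomy):
    """Walk the hierarchy to find which theme a given code supports."""
    for theme, categories in taxonomy.items():
        for codes in categories.values():
            if code in codes:
                return theme
    return None
```

Keeping the hierarchy explicit like this makes it easy to check that every code rolls up into exactly one category and theme.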

[Image: ajpe7113-fig1.jpg]

The Process of Qualitative Content Analysis

To more fully demonstrate the process of content analysis and the generation and use of codes, categories, and themes, we present and describe examples of both manifest and latent content analysis. Given that there are multiple ways to create and use codes, our examples illustrate both processes of creating and using a predetermined set of codes. Regardless of the kind of content analysis instructors want to conduct, the initial steps are the same. The instructor must analyze the data using codes as a sense-making process.

Manifest Content Analysis

The first form of analysis, manifest content analysis, examines text for elements that exist on the surface of the text, the meaning of which is taken at face value. Schools and colleges of pharmacy may benefit from conducting manifest content analyses at a programmatic level, including analysis of student evaluations to determine the value of certain courses, or analysis of recruitment materials for addressing issues of cultural humility in a uniform manner. Such uses for manifest content analysis may help administrators make more data-based decisions about students and courses. However, for our example of manifest content analysis, we illustrate the use of content analysis in informing instruction for a single pharmacy educator ( Figure 2 ).

[Image: ajpe7113-fig2.jpg]

A Student’s Completed Beta-blocker Case with Codes in Underlined Bold Text

In the example, a pharmacology instructor is trying to assess students’ understanding of three concepts related to the beta-blocker class of drugs: indication of the drug, relevance of family history, and contraindications and precautions. To do so, the instructor asks the students to write a patient case in which beta-blockers are indicated. The instructor gives the students the following prompt: “Reverse-engineer a case in which beta-blockers would be prescribed to the patient. Include a history of the present illness, the patient’s medical, family, and social history, medications, allergies, and relevant lab tests.” Figure 2 is a hypothetical student’s completed assignment, in which they demonstrate their understanding of when and why a beta-blocker would be prescribed.

The student-generated cases are then treated as data and analyzed for the presence of the three previously identified indicators of understanding in order to help the instructor make decisions about where and how to focus future teaching efforts related to this drug class. Codes are created a priori out of the instructor’s interest in analyzing students’ understanding of the concepts related to beta-blocker prescriptions. A codebook ( Table 2 ) is created with the following columns: name of code, code description, and examples of the code. This codebook helps an individual researcher to approach their analysis systematically, but it can also facilitate coding by multiple coders who would apply the same rules outlined in the codebook to the coding process.
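A codebook with the columns described above can be kept as a small structured file that every coder works from. The rows below are hypothetical entries for the beta-blocker example:

```python
import csv
import io

# Hypothetical codebook mirroring the columns described for Table 2:
# name of code, code description, and examples of the code.
CODEBOOK_CSV = """name,description,examples
indication,Patient presentation for which a beta-blocker is indicated,"hypertension; angina"
family_history,Relevant family history is documented,"father with MI at 52"
contraindication,Contraindications or precautions are noted,"asthma; bradycardia"
"""

def load_codebook(text):
    """Read the codebook into a dict keyed by code name so that each coder
    applies the same rules during analysis."""
    return {row["name"]: row for row in csv.DictReader(io.StringIO(text))}

codebook = load_codebook(CODEBOOK_CSV)
```

Storing the codebook as data rather than prose makes it straightforward to share among multiple coders and to version as codes are refined.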

Example Code Book Created for Manifest Content Analysis

[Image: ajpe7113-t2.jpg]

Using multiple coders introduces complexity to the analysis process, but it is oftentimes the only practical way to analyze large amounts of data. To ensure that all coders are working in tandem, they must establish inter-rater reliability as part of their training process. This process requires that a single form of text be selected, such as one student evaluation. After reviewing the codebook and receiving instruction, everyone on the team individually codes the same piece of data. While calculating percentage agreement has sometimes been used to establish inter-rater reliability, most publication editors require more rigorous statistical analysis (eg, Krippendorff’s alpha or Cohen’s kappa). 19 Detailed descriptions of these statistics fall outside the scope of this introduction, but it is important to note that the choice depends on the number of coders, the sample size, and the type of data to be analyzed.
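For the two-coder case, Cohen's kappa can be computed directly from the paired labels. The example labels are hypothetical ("E" for engaged, "N" for not engaged):

```python
def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders over the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each coder's
    marginal label frequencies."""
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    labels = set(coder_a) | set(coder_b)
    p_e = sum((coder_a.count(l) / n) * (coder_b.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)  # undefined when p_e == 1 (no label variation)

# Two coders rate the same four meaning units.
kappa = cohens_kappa(["E", "E", "N", "N"], ["E", "N", "N", "N"])
# kappa: 0.5 (observed agreement 0.75, chance agreement 0.5)
```

Unlike simple percentage agreement, kappa discounts the agreement the coders would reach by chance, which is why editors tend to prefer it.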

Latent Content Analysis

Latent content analysis is another option for pharmacy educators, especially when there are theoretical frameworks or lenses the educator proposes to apply. Such frameworks describe and provide structure to complex concepts and may often be derived from relevant theories. Latent content analysis requires that the researcher is intimately involved in interpreting and finding meaning in the text because meaning is not readily apparent on the surface. 10 To illustrate a latent content analysis using a combination of a priori and emergent codes, we will use the example of a transcribed video excerpt from a student pharmacist interaction with a standardized patient. In this example, the goal is for first-year students to practice talking to a customer about an over-the-counter medication. The case is designed to simulate a customer at a pharmacy counter, who is seeking advice on a medication. The learning objectives for the pharmacist in-training are to assess the customer’s symptoms, determine if the customer can self-treat or if they need to seek out their primary care physician, and then prescribe a medication to alleviate the patient’s symptoms.

To begin, pharmacy educators conducting educational research should first identify what they are looking for in the video transcript. In this case, because the primary outcome for this exercise is aimed at assessing the “soft skills” of student pharmacists, codes are created using the counseling rubric created by Horton and colleagues. 20 Four a priori codes are developed using the literature: empathy, patient-friendly terms, politeness, and positive attitude. However, because the original four codes are inadequate to capture all areas representing the skills the instructor is looking for during the process of analysis, four additional codes are also created: active listening, confidence, follow-up, and patient at ease. Figure 3 presents the video transcript with each of the codes assigned to the meaning units in bolded parentheses.

[Image: ajpe7113-fig3.jpg]

A Transcript of a Student’s (JR) Experience with a Standardized Patient (SP) in Which the Codes are Bolded in Parentheses

Following the initial coding using these eight codes, the codes are consolidated to create categories, which are depicted in the taxonomy in Figure 4 . Categories are relationships between codes that represent a higher level of abstraction in the data. 18 To reach conclusions and interpret the fundamental underlying meaning in the data, categories are then organized into themes ( Figure 1 ). Once the data are analyzed, the instructor can assign value to the student’s performance. In this case, the coding process determines that the exercise demonstrated both positive and negative elements of communication and professionalism. Under the category of professionalism, the student generally demonstrated politeness and a positive attitude toward the standardized patient, indicating to the reviewer that the theme of perceived professionalism was apparent during the encounter. However, there were several instances in which confidence and appropriate follow-up were absent. Thus, from a reviewer perspective, the student's performance could be perceived as indicating an opportunity to grow and improve as a future professional. Typically, there are multiple codes in a category and multiple categories in a theme. However, as seen in the example taxonomy, this is not always the case.

[Image: ajpe7113-fig4.jpg]

Example of a Latent Content Analysis Taxonomy

If the educator is interested in conducting a latent projective analysis, after identifying the construct of “soft skills,” the researcher allows each coder to apply their own mental schema as they look for positive and negative indicators of the non-technical skills they believe a student should develop. Mental schema are the cognitive structures that provide organization to knowledge, which in this case allows coders to categorize the data in ways that fit their existing understanding of the construct. The coders will use their own judgment to identify the codes they feel are relevant. The researcher could also choose to apply a theoretical lens to more effectively conceptualize the construct of “soft skills,” such as Rogers' humanism theory, and more specifically, concepts underlying his client-centered therapy. 21 The role of theory in both latent pattern and latent projective analyses is at the discretion of the researcher and is often determined by what already exists in the literature related to the research question; typically, in latent pattern analyses, theory is used for deductive coding, while in latent projective analyses, underdeveloped theory is first used to deduce codes and the results are then used inductively to strengthen the theory applied. For our example, Rogers describes three salient qualities to develop and maintain a positive client-professional relationship: unconditional positive regard, genuineness, and empathetic understanding. 21 For the third element, specifically, the educator could look for units of meaning that imply empathy and active listening. For our video transcript analysis, this is evident when the student pharmacist demonstrated empathy by responding, "Yeah, I understand," when discussing aggravating factors for the patient's condition. The outcome for both latent pattern and latent projective content analysis is to discover the underlying meaning in a text, such as social rules or mental models.
In this example, both pattern and projective approaches can discover interpreted aspects of a student’s abilities and mental models for constructs such as professionalism and empathy. The difference in the approaches is where the precedence lies: in the belief that a pattern is recognizable in the content, or in the mental schema and lived experiences of the coder(s). To better illustrate the differences in the processes of latent pattern and projective content analyses, Figure 5 presents a general outline of each method beginning with the creation of codes and concluding with the generation of themes.

[Image: ajpe7113-fig5.jpg]

Flow Chart of the Stages of Latent Pattern and Latent Projective Content Analysis

How to Choose a Methodological Approach to Content Analysis

To determine which approach a researcher should take in their content analysis, two decisions need to be made. First, researchers must determine their goal for the analysis. Second, the researcher must decide where they believe meaning is located. 14 If meaning is located in the discrete elements of the content that are easily identified on the surface of the text, then manifest content analysis is appropriate. If meaning is located deep within the content and the researcher plans to discover context cues and make judgements about implied meaning, then latent content analysis should be applied. When designing the latent content analysis, a researcher then must also identify their focus. If the analysis is intended to identify a recognizable truth within the content by uncovering connections and characteristics that all coders should be able to discover, then latent pattern content analysis is appropriate. If, on the other hand, the researcher will rely heavily on the judgment of the coders and believes that interpretation of the content must leverage the mental schema of the coders to locate deeper meaning, then latent projective content analysis is the best choice.
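The two decisions described above can be summarized as a small decision rule. This is a sketch of the reasoning, not a formal procedure:

```python
def choose_approach(meaning_on_surface, relies_on_coder_schema=False):
    """Sketch of the two decisions above: (1) where meaning is located,
    and (2), for latent analyses, where the focus lies."""
    if meaning_on_surface:
        return "manifest content analysis"
    if relies_on_coder_schema:
        return "latent projective content analysis"
    return "latent pattern content analysis"
```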

To demonstrate how a researcher might choose a methodological approach, we have presented a third example of data in Figure 6 . In our two previous examples of content analysis, we used student data. However, faculty data can also be analyzed as part of educational research or for faculty members to improve their own teaching practices. Recall in the video data analyzed using latent content analysis, the student was tasked to identify a suitable over-the-counter medication for a patient complaining of heartburn symptoms. We have extended this example by including an interview with the pharmacy educator supervising the student who was videotaped. The goal of the interview is to evaluate the educator’s ability to assess the student’s performance with the standardized patient. Figure 6 is an excerpt of the interview between the course instructor and an instructional coach. In this conversation, the instructional coach is eliciting evidence to support the faculty member’s views, judgements, and rationale for the educator’s evaluation of the student’s performance.

[Image: ajpe7113-fig6.jpg]

A Transcript of an Interview in Which the Interviewer (IN) Questions a Faculty Member (FM) Regarding Their Student’s Standardized Patient Experience

Manifest content analysis would be a valid choice for this data if the researcher was looking to identify evidence of the construct of “instructor priorities” and defined discrete codes that described aspects of performance such as “communication,” “referrals,” or “accurate information.” These codes could be easily identified on the surface of the transcribed interview by identifying keywords related to each code, such as “communicate,” “talk,” and “laugh,” for the code of “communication.” This would allow coders to identify evidence of the concept of “instructor priorities” by sorting through a potentially large amount of text with predetermined targets in mind.
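This keyword-driven manifest coding could be sketched as follows. The keyword list for "communication" comes from the text; the lists for the other two codes, and the example sentence, are hypothetical:

```python
# Map each manifest code to the surface keywords that signal it.
KEYWORDS = {
    "communication": ["communicate", "talk", "laugh"],
    "referrals": ["refer", "physician"],          # hypothetical keywords
    "accurate information": ["dose", "interaction"],  # hypothetical keywords
}

def manifest_codes(unit):
    """Assign every code whose keywords appear on the surface of the unit."""
    unit = unit.lower()
    return sorted(code for code, words in KEYWORDS.items()
                  if any(w in unit for w in words))

codes = manifest_codes("She was able to talk with the patient and refer him on.")
# codes: ["communication", "referrals"]
```

Because no judgment about implied meaning is required, this kind of matching scales easily to large interview corpora, which is precisely the appeal of the manifest approach here.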

To conduct a latent pattern analysis of this interview, researchers would first immerse themselves in the data to identify a theoretical framework or concepts that represent the area of interest so that coders could discover an emerging truth underneath the surface of the data. After immersion in the data, a researcher might believe it would be interesting to more closely examine the strategies the coach uses to establish rapport with the instructor as a way to better understand models of professional development. These strategies could not be easily identified in the transcripts if read literally, but by looking for connections within the text, codes related to instructional coaching tactics emerge. A latent pattern analysis would require that the researcher code the data in a way that looks for patterns, such as a code of “facilitating reflection,” that could be identified in open-ended questions and other units of meaning where the coder saw evidence of probing techniques, or a code of “establishing rapport” for which a coder could identify nonverbal cues such as “[IN leans forward in chair].”

Conducting latent projective content analysis might be useful if the researcher was interested in using a broader theoretical lens, such as Mezirow’s theory of transformative learning. 22 In this example, the faculty member is understood to have attempted to change a learner’s frame of reference by facilitating cognitive dissonance or a disorienting experience through a standardized patient simulation. To conduct a latent projective analysis, the researcher could analyze the faculty member’s interview using concepts found in this theory. This kind of analysis will help the researcher assess the level of change that the faculty member was able to perceive, or expected to witness, in their attempt to help their pharmacy students improve their interactions with patients. The units of meaning and subsequent codes would rely on the coders to apply their own knowledge of transformative learning because of the absence in the theory of concrete, context-specific behaviors to identify. For this analysis, the researcher would rely on their interpretations of what challenging educational situations look like, what constitutes cognitive dissonance, or what the faculty member is really expecting from his students’ performance. The subsequent analysis could provide evidence to support the use of such standardized patient encounters within the curriculum as a transformative learning experience and would also allow the educator to self-reflect on his ability to assess simulated activities.

OTHER ASPECTS TO CONSIDER

Navigating Terminology

Among the methodological approaches, there are other terms for content analysis that researchers may come across. Hsieh and Shannon 10 proposed three qualitative approaches to content analysis: conventional, directed, and summative. These categories were intended to explain the role of theory in the analysis process. In conventional content analysis, the researcher does not use preconceived categories because existing theory or literature are limited. In directed content analysis, the researcher attempts to further describe a phenomenon already addressed by theory, applying a deductive approach and using identified concepts or codes from existing research to validate the theory. In summative content analysis, a descriptive approach is taken, identifying and quantifying words or content in order to describe their context. These three categories roughly map to the terms of latent projective, latent pattern, and manifest content analyses respectively, though not precisely enough to suggest that they are synonyms.

Graneheim and colleagues 9 reference the inductive, deductive, and abductive methods of interpretation in content analysis, which are data-driven, concept-driven, and fluid between both data and concepts, respectively. Manifest content analysis most often (but not always) produces phenomenological descriptions through deductive interpretation, while latent content analysis most often (but not always) produces interpretations through inductive or abductive approaches. Erlingsson and Brysiewicz 23 refer to content analysis as a continuum, progressing as the researcher develops codes, then categories, and then themes. We present these alternative conceptualizations of content analysis to illustrate that the literature on content analysis, while incredibly useful, presents a multitude of interpretations of the method itself. However, these complexities should not dissuade readers from using content analysis. Identifying what you want to know (ie, your research question) will effectively direct you toward your methodological approach. That said, we have found the most helpful aid in learning content analysis is the application of the methods we have presented.

Ensuring Quality

The standards used to evaluate quantitative research are seldom used in qualitative research. The terms “reliability” and “validity” are typically not used because they reflect the positivist quantitative paradigm. In qualitative research, the preferred term is “trustworthiness,” which comprises the concepts of credibility, transferability, dependability, and confirmability, and researchers can take steps in their work to demonstrate that they are trustworthy. 24 Though establishing trustworthiness is outside the scope of this article, novice researchers should be familiar with the necessary steps before publishing their work. This suggestion includes exploration of the concept of saturation, the idea that researchers must demonstrate they have collected and analyzed enough data to warrant their conclusions, which has been a focus of recent debate in qualitative research. 25

There are several threats to the trustworthiness of content analysis in particular. 14 We will use the terms “reliability and validity” to describe these threats, as they are conceptualized this way in the formative literature, and it may be easier for researchers with a quantitative research background to recognize them. Though some of these threats may be particular to the type of data being analyzed, in general, there are risks specific to the different methods of content analysis. In manifest content analysis, reliability is necessary but not sufficient to establish validity. 14 Because there is little judgment required of the coders, lack of high inter-rater agreement among coders will render the data invalid. 14 Additionally, coder fatigue is a common threat to manifest content analysis because the coding is clerical and repetitive in nature.

For latent pattern content analysis, validity and reliability are inversely related. 14 Greater reliability is achieved through more detailed coding rules to improve consistency, but these rules may diminish the accessibility of the coding to consumers of the research. This is defined as low ecological validity. Higher ecological validity is achieved through greater reliance on coder judgment to increase the resonance of the results with the audience, yet this often decreases the inter-rater reliability. In latent projective content analysis, reliability and validity are equivalent. 14 Consistent interpretations among coders both establish and validate the constructed norm; construction of an accurate norm is evidence of consistency. However, because of this equivalence, issues with low validity or low reliability cannot be isolated. A lack of consistency may result from coding rules, lack of a shared schema, or issues with a defined variable. Reasons for low validity likewise cannot be isolated but will always manifest as low consistency.

Any good analysis starts with a codebook and coder training. It is important for all coders to share the mental model of the skill, construct, or phenomenon being coded in the data. However, when conducting latent pattern or projective content analysis in particular, micro-level rules and definitions of codes increase the threat to ecological validity, so it is important to leave enough room in the codebook and during the training to allow for a shared mental schema to emerge in the larger group rather than being strictly directed by the lead researcher. Stability is another threat, which occurs when coders make different judgments as time passes. To reduce this risk, allowing for recoding at a later date can increase the consistency and stability of the codes. Reproducibility is not typically a goal of qualitative research, 15 but for content analysis, codes that are defined both prior to and during analysis should retain their meaning. Researchers can increase the reproducibility of their codebook by creating a detailed audit trail, including descriptions of the methods used to create and define the codes, materials used for the training of the coders, and steps taken to ensure inter-rater reliability.

In all forms of qualitative analysis, coder fatigue is a common threat to trustworthiness, even when the instructor is coding individually. Over time, the cases may start to look the same, making it difficult to refocus and look at each case with fresh eyes. To guard against this, coders should maintain a reflective journal and write analytical memos to help stay focused. Memos might include insights that the researcher has, such as patterns of misunderstanding, areas to focus on when considering re-teaching specific concepts, or specific conversations to have with students. Fatigue can also be mitigated by occasionally talking to participants (eg, meeting with students and listening for their rationale on why they included specific pieces of information in an assignment). These are just examples of potential exercises that can help coders mitigate cognitive fatigue. Most researchers develop their own ways to prevent the fatigue that can seep in after long hours of looking at data. But above all, a sufficient amount of time should be allowed for analysis, so that coders do not feel rushed, and regular breaks should be scheduled and enforced.

Qualitative content analysis is both accessible and high-yield for pharmacy educators and researchers. Though some of its methods may seem abstract or fluid, qualitative content analysis addresses these concerns by providing a systematic approach to discovering meaning in textual data, both on the surface and implied beneath it. As with most research methods, the surest path to proficiency is intentional, repeated practice. We encourage pharmacy educators to ask questions suited to qualitative research and to consider content analysis as a qualitative method for discovering meaning in their data.
