CodelistGenerator options: examples with a mock vocabulary

Mock vocabulary database

Let's say we have a mock vocabulary database with these hypothetical concepts and relationships.

Search for exact keyword match

To find “Musculoskeletal disorder”, we can search for that exact term:

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = "Musculoskeletal disorder",
  domains = "Condition",
  includeDescendants = FALSE
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 1 | Musculoskeletal disorder | condition | clinical finding | snomed | From initial search |

Note that we would also identify it from a partial match:

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = "Musculoskeletal",
  domains = "Condition",
  includeDescendants = FALSE
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 1 | Musculoskeletal disorder | condition | clinical finding | snomed | From initial search |
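The idea behind full and partial keyword matches can be sketched in base R as a case-insensitive substring test. This is only an illustration under stated assumptions: the data frame copies the mock concepts shown in this document, `match_keyword()` is a hypothetical helper, and none of this is CodelistGenerator's actual implementation.

```r
# Mock concepts copied from the tables in this document
concepts <- data.frame(
  concept_id = 1:5,
  concept_name = c(
    "Musculoskeletal disorder", "Osteoarthrosis", "Arthritis",
    "Osteoarthritis of knee", "Osteoarthritis of hip"
  )
)

# Hypothetical helper: case-insensitive substring match on concept names
match_keyword <- function(keyword, concepts) {
  hit <- grepl(tolower(keyword), tolower(concepts$concept_name), fixed = TRUE)
  concepts[hit, ]
}

full_match <- match_keyword("Musculoskeletal disorder", concepts)
partial_match <- match_keyword("Musculoskeletal", concepts)
```

Both calls return the same single row, because the shorter keyword is contained in the full concept name.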

Add descendants

To include descendants of an identified code, we can set includeDescendants to TRUE:

kable(getCandidateCodes(
  cdm = cdm,
  keywords = "Musculoskeletal disorder",
  domains = "Condition",
  includeDescendants = TRUE
))
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 1 | Musculoskeletal disorder | condition | clinical finding | snomed | From initial search |
| 2 | Osteoarthrosis | condition | clinical finding | snomed | From descendants |
| 3 | Arthritis | condition | clinical finding | snomed | From descendants |
| 4 | Osteoarthritis of knee | condition | clinical finding | snomed | From descendants |
| 5 | Osteoarthritis of hip | condition | clinical finding | snomed | From descendants |
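Conceptually, adding descendants means walking the concept hierarchy downwards from each identified concept. The base R sketch below illustrates this with a small parent-to-child edge table. The exact hierarchy is an assumption chosen to be consistent with the results shown here (concept 1 at the top, concepts 4 and 5 below concept 3), and `get_descendants()` is a hypothetical helper, not a CodelistGenerator function.

```r
# Hypothetical parent -> child edges between the mock concept ids
edges <- data.frame(
  parent = c(1, 1, 3, 3),
  child  = c(2, 3, 4, 5)
)

# Hypothetical helper: collect all descendants level by level
get_descendants <- function(ids, edges) {
  found <- c()
  frontier <- ids
  while (length(frontier) > 0) {
    children <- edges$child[edges$parent %in% frontier]
    # keep only concepts not seen before, so the loop always terminates
    frontier <- setdiff(children, c(found, ids))
    found <- c(found, frontier)
  }
  found
}

descendants_of_1 <- get_descendants(1, edges)
descendants_of_3 <- get_descendants(3, edges)
```

Under this assumed hierarchy, starting from concept 1 reaches every other mock concept, while starting from concept 3 reaches only the two osteoarthritis concepts.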

Multiple search terms

We can also search for several keywords at once. The following search picks up all of the concepts above directly, without relying on descendants:

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = c(
    "Musculoskeletal disorder",
    "arthritis",
    "arthrosis"
  ),
  domains = "Condition",
  includeDescendants = FALSE
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 1 | Musculoskeletal disorder | condition | clinical finding | snomed | From initial search |
| 3 | Arthritis | condition | clinical finding | snomed | From initial search |
| 4 | Osteoarthritis of knee | condition | clinical finding | snomed | From initial search |
| 5 | Osteoarthritis of hip | condition | clinical finding | snomed | From initial search |
| 2 | Osteoarthrosis | condition | clinical finding | snomed | From initial search |

Add ancestor

To include the ancestors one level above the identified concepts, we can set includeAncestor to TRUE:

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = "Osteoarthritis of knee",
  includeAncestor = TRUE,
  domains = "Condition"
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 4 | Osteoarthritis of knee | condition | clinical finding | snomed | From initial search |
| 3 | Arthritis | condition | clinical finding | snomed | From ancestor |

Searches with multiple words

We can also find concepts with multiple words even if they are in a different order. For example, a search for “Knee osteoarthritis” will pick up “Osteoarthritis of knee”.

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = "Knee osteoarthritis",
  domains = "Condition",
  includeDescendants = TRUE
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 4 | Osteoarthritis of knee | condition | clinical finding | snomed | From initial search |
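One way to picture order-independent matching is to split the keyword into words and require every word to appear somewhere in the concept name. The base R sketch below does exactly that. `matches_all_words()` is a hypothetical helper written for illustration, and the package's own matching logic may differ.

```r
# Mock concept names copied from the tables in this document
concept_names <- c(
  "Musculoskeletal disorder", "Osteoarthrosis", "Arthritis",
  "Osteoarthritis of knee", "Osteoarthritis of hip"
)

# Hypothetical helper: TRUE when every word of the keyword occurs in the name
matches_all_words <- function(keyword, name) {
  words <- strsplit(tolower(keyword), "\\s+")[[1]]
  all(vapply(words, grepl, logical(1), x = tolower(name), fixed = TRUE))
}

is_hit <- vapply(
  concept_names,
  function(nm) matches_all_words("Knee osteoarthritis", nm),
  logical(1)
)
hits <- concept_names[is_hit]
```

Only “Osteoarthritis of knee” contains both “knee” and “osteoarthritis”, so it is the only hit regardless of the order of the words in the keyword.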

With exclusions

We can also exclude specific terms. Here, excluding “Hip osteoarthritis” drops “Osteoarthritis of hip” from the results:

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = "arthritis",
  exclude = "Hip osteoarthritis",
  domains = "Condition"
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 3 | Arthritis | condition | clinical finding | snomed | From initial search |
| 4 | Osteoarthritis of knee | condition | clinical finding | snomed | From initial search |

Search using synonyms

We can also pick up codes based on their synonyms. In this case, “Arthritis” has a synonym of “Osteoarthrosis”, so a search that covers both the primary name of each concept and its associated synonyms matches “Arthritis” through that synonym and includes it. The descendants of the matched concepts are returned as well.

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = "osteoarthrosis",
  domains = "Condition",
  searchInSynonyms = TRUE
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 2 | Osteoarthrosis | condition | clinical finding | snomed | From initial search |
| 3 | Arthritis | condition | clinical finding | snomed | In synonyms |
| 4 | Osteoarthritis of knee | condition | clinical finding | snomed | From descendants |
| 5 | Osteoarthritis of hip | condition | clinical finding | snomed | From descendants |

Alternatively, we can get the same result by searching via synonyms. With searchViaSynonyms = TRUE, “Arthritis” (which is identified first) has a synonym of “Osteoarthrosis”, and this synonym is then used to bring in the “Osteoarthrosis” concept.

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = "arthritis",
  domains = "Condition",
  searchViaSynonyms = TRUE
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 3 | Arthritis | condition | clinical finding | snomed | From initial search |
| 4 | Osteoarthritis of knee | condition | clinical finding | snomed | From initial search |
| 5 | Osteoarthritis of hip | condition | clinical finding | snomed | From initial search |
| 2 | Osteoarthrosis | condition | clinical finding | snomed | Via synonyms |
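The difference between searching in synonyms and searching via synonyms can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: the synonym table holds only the one synonym mentioned in this document, the helper functions are hypothetical, and descendants are left out to keep the two ideas separate.

```r
# Mock concepts copied from the tables in this document
concepts <- data.frame(
  concept_id = 1:5,
  concept_name = c(
    "Musculoskeletal disorder", "Osteoarthrosis", "Arthritis",
    "Osteoarthritis of knee", "Osteoarthritis of hip"
  )
)
# "Arthritis" (concept 3) has the synonym "Osteoarthrosis"
synonyms <- data.frame(concept_id = 3L, synonym_name = "Osteoarthrosis")

# Hypothetical helper: case-insensitive substring test
contains <- function(keyword, x) {
  grepl(tolower(keyword), tolower(x), fixed = TRUE)
}

# Search IN synonyms: the keyword is matched against names and synonym names
search_in_synonyms <- function(keyword) {
  by_name <- concepts$concept_id[contains(keyword, concepts$concept_name)]
  by_synonym <- synonyms$concept_id[contains(keyword, synonyms$synonym_name)]
  union(by_name, by_synonym)
}

# Search VIA synonyms: synonyms of the first hits become extra search terms
search_via_synonyms <- function(keyword) {
  first <- concepts$concept_id[contains(keyword, concepts$concept_name)]
  extra_terms <- synonyms$synonym_name[synonyms$concept_id %in% first]
  extra <- unlist(lapply(extra_terms, function(term) {
    concepts$concept_id[contains(term, concepts$concept_name)]
  }))
  union(first, extra)
}

in_ids <- search_in_synonyms("osteoarthrosis")
via_ids <- search_via_synonyms("arthritis")
```

In the first case the keyword reaches “Arthritis” through its synonym. In the second, “Arthritis” is found by name and its synonym is then searched to reach “Osteoarthrosis”.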

Fuzzy matches instead of only exact matches

We could also have picked up “Osteoarthrosis” with fuzzy matching, which allows for some differences in spelling.

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = "arthritis",
  domains = "Condition",
  fuzzyMatch = TRUE,
  maxDistanceCost = 0.2
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 2 | Osteoarthrosis | condition | clinical finding | snomed | From initial search |
| 3 | Arthritis | condition | clinical finding | snomed | From initial search |
| 4 | Osteoarthritis of knee | condition | clinical finding | snomed | From initial search |
| 5 | Osteoarthritis of hip | condition | clinical finding | snomed | From initial search |
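To get a feel for how a distance threshold of 0.2 behaves, the sketch below uses base R's `agrepl()` for approximate matching on the mock concept names. Treating the threshold as `max.distance` in `agrepl()` is an assumption made for illustration, so the package's own fuzzy matching may not behave identically.

```r
# Mock concept names copied from the tables in this document
concept_names <- c(
  "Musculoskeletal disorder", "Osteoarthrosis", "Arthritis",
  "Osteoarthritis of knee", "Osteoarthritis of hip"
)

# Approximate, case-insensitive matching; max.distance below 1 is read as a
# fraction of the pattern length (9 characters here, so up to 2 edits)
is_fuzzy_hit <- agrepl(
  "arthritis", concept_names,
  max.distance = 0.2, ignore.case = TRUE
)
fuzzy_hits <- concept_names[is_fuzzy_hit]
```

“arthrosis” differs from “arthritis” by two substituted letters, which is within the allowed distance, whereas “Musculoskeletal disorder” is nowhere near the keyword and is not matched.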

Search via non-standard

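We could also have picked up “Osteoarthrosis” by searching via non-standard concepts, where non-standard concepts that match the keywords are used to identify the standard concepts they map to.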

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = c("arthritis", "arthropathy"),
  domains = "Condition",
  searchNonStandard = TRUE
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 3 | Arthritis | condition | clinical finding | snomed | From initial search |
| 4 | Osteoarthritis of knee | condition | clinical finding | snomed | From initial search |
| 5 | Osteoarthritis of hip | condition | clinical finding | snomed | From initial search |
| 2 | Osteoarthrosis | condition | clinical finding | snomed | From non-standard |

Search for both standard and non-standard concepts

We can also include non-standard codes in our results:

codes <- getCandidateCodes(
  cdm = cdm,
  keywords = c(
    "Musculoskeletal disorder",
    "arthritis",
    "arthropathy",
    "arthrosis"
  ),
  domains = "Condition",
  standardConcept = c("Standard", "Non-standard")
)

kable(codes)
| concept_id | concept_name | domain_id | concept_class_id | vocabulary_id | found_from |
|---|---|---|---|---|---|
| 1 | Musculoskeletal disorder | condition | clinical finding | snomed | From initial search |
| 3 | Arthritis | condition | clinical finding | snomed | From initial search |
| 4 | Osteoarthritis of knee | condition | clinical finding | snomed | From initial search |
| 5 | Osteoarthritis of hip | condition | clinical finding | snomed | From initial search |
| 8 | Knee osteoarthritis | condition | diagnosis | read | From initial search |
| 17 | Arthritis | condition | icd code | icd10 | From initial search |
| 7 | Degenerative arthropathy | condition | diagnosis | read | From initial search |
| 2 | Osteoarthrosis | condition | clinical finding | snomed | From initial search |