Gus Lipkin

2023-02-11

What is cipheR and why?

  • cipheR is a package for simple text ciphers.
    • A cipher is text that has been transformed by some method with the intention of hiding its meaning
  • I made it because I needed a flexible Caesar Cipher for Advent of Code

To install from CRAN

install.packages("cipheR")

What I’m hoping you learn

  • Some cool new text ciphers perfect for passing notes in class
  • Base R methods to deal with text
  • Writing speedy code with apply

What I’m hoping to get feedback on

  • Error handling (and if we have time, testing)
  • UX and how best to work with vector inputs
  • What kinds of code comments people expect in open source
  • Any spots where I can swap an apply for vectorization

Caesar Cipher

  • Description: This can be used to create (encrypt) and solve (decrypt) a Caesar cipher. The function does not differentiate between the two.
  • Differentiator: You can choose your dictionary to shift by.

Parameters

  • x A vector to be shifted
  • n The number of places to shift by. This can be either positive or negative. Zero returns x as it was given to the function.
  • preserve_spaces (Default: TRUE) A boolean describing if spaces should be preserved. This is helpful when working with sentences.
  • dict The dictionary used for shifting. This defaults to NULL in which case a dictionary is built from the sorted unique values of x.
  • preset A pre-made dictionary using ASCII codes from ascii-code.com. Note that delete is excluded as a character.
  • The options are to only use provided characters (NULL), alphanumeric, keyboard, letters, lowercase, uppercase. Characters are US English.

Examples

(e1 <- caesar("abcde", 1))
[1] "bcdea"
caesar(e1, -1)
[1] "abcde"
(e2 <- caesar("cipheR is a great R package!", -5))
[1] "Rae!sh ai r tgsrk h erRcrtsp"
caesar(e2, 5)
[1] "cipheR is a great R package!"
(e3 <- caesar("Isn't this fun?", 2, preserve_spaces = FALSE))
[1] "suth ' Inu'i?tf"
caesar(e3, -2, preserve_spaces = FALSE)
[1] "Isn't this fun?"

caesar.R

caesar <- function(x, n = 1, preserve_spaces = TRUE, dict = NULL, preset = NULL) {

  # Catching errors
  if (length(x) == 0) {
    stop("Please provide a vector of length greater than zero to shift.")
  } else if (!is.atomic(x) | !(is.character(x) | is.numeric(x))) {
    stop("x must be a numeric or character vector.")
  }

  if (!is.null(dict) & !is.null(preset)) {
    warning("Both a dict and a preset was provided, only the dict will be used.")
  }

  if (n == 0) {
    message("Shifting by zero doesn't do anything...")
    return(x)
  }
  
  if (!is.character(x)) { x <- as.character(x) }
  
  x <- strsplit(x, "")
  unlistX <- unlist(x)
  
  # Set the dictionary as needed
  if (is.null(dict) & is.null(preset)) {
    dict <- sort(unique(unlistX))
  } else if (!is.null(preset)) {
    if (preset == "alphanumeric")   { dict <- rawToChar(as.raw(c(48:57, 65:90, 97:122))) } 
    else if (preset == "keyboard")  { dict <- rawToChar(as.raw(32:126)) } 
    else if (preset == "letters")   { dict <- rawToChar(as.raw(c(65:90, 97:122))) } 
    else if (preset == "lowercase") { dict <- rawToChar(as.raw(c(97:122))) } 
    else if (preset == "uppercase") { dict <- rawToChar(as.raw(c(65:90))) } 
    else {
      stop("It looks like you may have a typo in your presets. 
            You can double-check the presets by running ?caesar in your console.")
    }
  }

  if (length(dict) == 1) { dict <- unlist(strsplit(dict, "")) }

  if (!all(unlistX %in% dict)) {
    stop("Not all values of x are in the character set. Please choose a different character set.")
  }
  
  x <- lapply(x, function(y) {
      # We need to preserve spaces as requested
      if (preserve_spaces) {
        isSpace <- which(dict == " ")
        if (length(isSpace > 0)) { dict <- dict[-which(dict == " ")] }
      }

      # Do the shifting
      y <- sapply(y, function(z) {
          if (preserve_spaces & z == " ") { return(" ") }
          hop <- which(z == dict) + n
          hop <- hop %% length(dict)
          if (hop == 0) { hop <- length(dict) }
          return(dict[hop])
        })
      y <- paste0(y, collapse = "")
      return(y)
    })

  x <- unlist(x, recursive = FALSE)

  return(x)
}

test-caesar.R

test_that("Shift 0 warns and works", {
  expect_message(caesar("abcde", 0))
  expect_equal(caesar("abcde", 0), "abcde")
})
test_that("Shift 1 works", {
  expect_equal(caesar("abcde", 1), "bcdea")
})
test_that("Shift -1 works", {
  expect_equal(caesar("abcde", -1), "eabcd")
})

test_that("Spaces are preserved", {
  expect_equal(caesar("a b c", 1), "b c a")
})
test_that("Spaces are not preserved", {
  expect_equal(caesar("a b c", 1, preserve_spaces = FALSE), "baca ")
})

test_that("Vector length is preserved", {
  expect_equal(caesar(c("a", "b", "c"), 1), c("b", "c", "a"))
})

test_that("When no dictionary is provided, x is used", {
  expect_equal(caesar("abcde", 1), "bcdea")
})

test_that("Cannot provide both a dict and a preset", {
  expect_warning(caesar("abcde", 1, dict = "abcde", preset = "letters"))
})

test_that("The alphanumeric preset works", {
  expect_equal(caesar("abc123", 1, preset = "alphanumeric"), "bcd234")
})
test_that("The keyboard preset works", {
  expect_equal(caesar("abc123!@#", 1, preset = "keyboard"), "bcd234\"A$")
})
test_that("The letters preset works", {
  expect_equal(caesar("abcABC", 1, preset = "letters"), "bcdBCD")
})
test_that("The lowercase preset works", {
  expect_equal(caesar("abcde", 1, preset = "lowercase"), "bcdef")
})
test_that("The uppercase preset works", {
  expect_equal(caesar("ABCDE", 1, preset = "uppercase"), "BCDEF")
})
test_that("A typo in the preset gives an error", {
  expect_error(caesar("abc", 1, preset = "abc"))
})

test_that("dict must contain all values in x", {
  expect_error(caesar("abc", 1, dict = "def"))
})

test_that("x cannot be empty", {
  expect_error(caesar(c(), 1))
})
test_that("x must be a vector", {
  expect_error(caesar(list("abc"), 1))
})
test_that("x must be a character or numeric vector", {
  expect_error(caesar(c(TRUE), 1))
})

Atbash Cipher

  • Description: This can be used to create (encrypt) and solve (decrypt) an Atbash Cipher. An Atbash Cipher swaps letters’ places in the alphabet. Thus, ‘a’ becomes ‘z’, ‘b’ becomes ‘y’, and so on. The function does not differentiate between the two.
  • An Atbash cipher and a Caesar cipher with a shift of thirteen are the same.
a b c d
z y x w

Parameter

  • x A vector to be encoded or decoded

Examples

(e1 <- atbash("abcde"))
[1] "zyxwv"
atbash(e1)
[1] "abcde"
(e2 <- atbash("cipheR is a great R package!"))
[1] "xrksvI rh z tivzg I kzxpztv!"
atbash(e2)
[1] "cipheR is a great R package!"
(e3 <- atbash("Isn't this fun?"))
[1] "Rhm'g gsrh ufm?"
atbash(e3)
[1] "Isn't this fun?"

atbash.R

atbash <- function(x) {

  if (length(x) == 0) {
    stop("Please provide a vector of length greater than zero.")
  } else if (!is.atomic(x) & !is.character(x)) {
    stop("x must be a character vector.")
  }

  x <- strsplit(x, "")

  x <- lapply(x, function(y) {
    y <- sapply(y, function(z) {
        if (z %in% c(letters, LETTERS)) {
          if (grepl("[A-Z]", z)) { z <- LETTERS[27 - which(LETTERS == z)] } 
          else { z <- letters[27 - which(letters == z)] }
        }
        return(z)
      })
    y <- paste0(y, collapse = "")
    return(y)
  })

  x <- unlist(x, recursive = FALSE)

  return(x)
}

test-atbash.R

test_that("Letters are flipped", {
  expect_equal(atbash("az"), "za")
})

test_that("Spaces are preserved", {
  expect_equal(atbash("a b c"), "z y x")
})
test_that("Punctuation is preserved", {
  expect_equal(atbash("a.b"), "z.y")
})

test_that("Vector length is preserved", {
  expect_equal(atbash(c("a", "b", "c")), c("z", "y", "x"))
})

test_that("Must be a character vector of length greater than zero", {
  expect_error(atbash(c(TRUE)))
  expect_error(atbash(c(123)))
  expect_error(atbash(c()))
})

Rail Fence Cipher

  • Description: This can be used to create (encrypt) and solve (decrypt) a Railfence Cipher. A Railfence Cipher maps each letter to a cosine wave of the specified height where each letter resides at an integer rail height.

CAN YOU READ ME

C - - - - - U - - - - - - -
- A - - - O - - - - D - M -
- - N - Y - - - R - A - - - E
- - - - - - - - E - - - - -

CU AO DMNYRAE E

Parameters

  • x A vector to be encoded or decoded
  • n (Default: 1) The width of the rail to be used. A width of one will have no effect.
  • decrypt (Default: FALSE) The default FALSE will encrypt while using TRUE will decrypt a given value of x.

Examples

(e1 <- railfence("abcde", 2))
[1] "acebd"
railfence(e1, 2, decrypt = TRUE)
[1] "abcde"
(e2 <- railfence("cipheR is a great R package!", 4))
[1] "c gRaiRi r  kgpesaetpceh aa!"
railfence(e2, 4, decrypt = TRUE)
[1] "cipheR is a great R package!"
(e3 <- railfence("Isn't this fun?", 3))
[1] "Itius' hsfnnt ?"
railfence(e3, 3, decrypt = TRUE)
[1] "Isn't this fun?"

railfence.R

railfence <- function(x, n = 1, decrypt = FALSE) {

  if (length(x) == 0) { stop("Please provide a vector of length greater than zero") } 
  else if (!is.atomic(x) & !is.character(x)) { stop("x must be a character vector.") }

  if (length(n) != 1 || n < 1 || n %% 1 != 0) { 
    stop("n must be a single integer greater than or equal to 1") 
  }

  if (!is.logical(decrypt)) { stop("decrypt must be logical") }

  if (!decrypt) { x <- .railfence_encrypt(x, n) } 
  else { x <- .railfence_decrypt(x, n) }

  return(x)
}

#' @keywords internal
.railfence_encrypt <- function(x, n) {
  x <- strsplit(x, "")

  x <- lapply(x, function(y) {
    nx <- length(y)
    z <- c(1, rep_len(c(2:n, (n - 1):1), nx - 1))
    y <- data.frame("x" = y, "pos" = z)
    y <- y[order(y$pos), ]
    y <- paste0(y$x, collapse = "")
    return(y)
  })

  x <- unlist(x, recursive = FALSE)

  return(x)
}

#' @keywords internal
.railfence_decrypt <- function(x, n) {
  x <-
    lapply(x, function(s) {
      s <- unlist(strsplit(s, ""))
      m <- matrix(NA, n, length(s))
      j <- 1
      dir <- 1
      for (i in 1:length(s)) {
        m[j, i] <- "*"

        if (dir == 1) {
          if (j == n) {
            dir <- -1
            j <- j - 1
          } else {
            j <- j + 1
          }
        } else {
          if (j == 1) {
            dir <- 1
            j <- j + 1
          } else {
            j <- j - 1
          }
        }
      }
      k <- 1
      for (j in 1:nrow(m)) {
        for (i in 1:ncol(m)) {
          if (!is.na(m[j, i])) {
            m[j, i] <- s[k]
            k <- k + 1
          }
        }
      }
      m <- as.vector(m)
      m <- paste(m[!is.na(m)], collapse = "")

      return(m)
    })

  x <- unlist(x, recursive = FALSE)

  return(x)
}

test-railfence.R

test_that("encryption works", {
  expect_equal(railfence("abc def ghij", 3), "adgb e hjcfi")
  expect_equal(railfence(c("abc", "def", "ghij"), 2), c("acb", "dfe", "gihj"))
})
test_that("decryption works", {
  expect_equal(railfence("adgb e hjcfi", 3, decrypt = TRUE), "abc def ghij")
  expect_equal(railfence(c("acb", "dfe", "gihj"), 2, decrypt = TRUE),
               c("abc", "def", "ghij"))
})

test_that("must be a vector with length > 0", {
  expect_error(railfence(c(), 3))
  expect_error(railfence(matrix(NA, 1, 1), 3))
})

test_that("n must be a single integer greater than or equal to 1", {
  expect_error(railfence("abc", 0))
  expect_error(railfence("abc", 1.5))
  expect_error(railfence("abc", 0:3))
})

test_that("decrypt must be logical", {
  expect_error(railfence("abc", 3, "true"))
})

Vigenere Cipher

  • Description: This can be used to create (encrypt) and solve (decrypt) a Vigenere Cipher. A Vigenere cipher uses a table of alphabetic Caesar shifts for one to twenty-six. Each letter and corresponding key value determine the grid location to choose the obfuscated letter from.

Parameters

  • x A vector to be encoded or decoded.
  • key A character vector of length one to use as a key
  • decrypt (Default: FALSE) The default FALSE will encrypt while using TRUE will decrypt a given value of x.
  • keep_punctuation (Default: FALSE) The default FALSE will ignore case and punctuation and return a lowercase result. TRUE will match the input’s case and punctuation.

Examples

(e1 <- vigenere("abcde", "key"))
[1] "kfani"
vigenere(e1, "key", decrypt = TRUE)
[1] "abcde"
(e2 <- vigenere("cipheR is a great R package!", "key"))
[1] "mmnripswyqvckxpzeaueeo"
vigenere(e2, "key", decrypt = TRUE)
[1] "cipherisagreatrpackage"
(e3 <- vigenere("Isn't this fun?", "key", keep_punctuation = TRUE))
[1] "'"
[1] " "
[1] " "
[1] "?"
[1] "Swl'd xfsw der?"
vigenere(e3, "key", decrypt = TRUE, keep_punctuation = TRUE)
[1] "Isn't this fun?"

vigenere.R

vigenere <- function(x, key, decrypt = FALSE, keep_punctuation = FALSE) {

  if (length(x) == 0) {
    stop("Please provide a vector of length greater than zero for x")
  } else if (!is.atomic(x) & !is.character(x) & !is.matrix(x)) {
    stop("x must be a character vector.")
  }

  if (length(key) != 1) {
    stop("Please provide a vector of length one for key")
  } else if (!is.character(key) & (is.matrix(key) | !is.atomic(key))) {
    stop("key must be a character vector.")
  }

  if (!is.logical(decrypt)) {
    stop("decrypt must be logical")
  }
  if (!is.logical(keep_punctuation)) {
    stop("decrypt must be logical")
  }

  square <-
    suppressMessages(matrix(sapply(0:25, function(x) {
      caesar(letters, x)
    }), 26, 26))

  key <- tolower(key)

  if (!decrypt) {
    x <- .vigenere_encrypt(x, key, square, keep_punctuation)
  } else {
    x <- .vigenere_decrypt(x, key, square, keep_punctuation)
  }

  return(x)
}

#' @keywords internal
.vigenere_encrypt <- function(x, key, square, keep_punctuation) {
  x <- lapply(x, function(x) {
    x <- unlist(strsplit(x, ""))
    if (!keep_punctuation) { x <- x[grepl("[A-z]", x)] }
    lowerX <- tolower(x)

    key <- unlist(strsplit(key, ""))
    key <- .rep_key(x, key)
    r <- sapply(key, .get_letter)
    c <- sapply(lowerX, .get_letter)
    
    x <- sapply(1:length(x), function(i) {
      if (c[i] == 0) { y <- x[i] } 
      else { y <- square[r[i], c[i]] }
      if (keep_punctuation & grepl("[A-Z]", x[i])) { return(toupper(y)) }
      else { return(y) }
    })
    x <- paste0(x, collapse = "")
    return(x)
  })
  x <- unlist(x, recursive = FALSE)
  return(x)
}

#' @keywords internal
.vigenere_decrypt <- function(x, key, square, keep_punctuation) {
  x <- lapply(x, function(x) {
    x <- unlist(strsplit(x, ""))
    if (!keep_punctuation) { x <- x[grepl("[A-z]", x)] }
    lowerX <- tolower(x)

    key <- unlist(strsplit(key, ""))
    key <- .rep_key(x, key)
    r <- sapply(key, .get_letter)

    x <- mapply(function(r, x, lowerX) {
      i <- square[1, which(square[r, ] == lowerX)]
      if (keep_punctuation & grepl("[A-Z]", x)) { i <- toupper(i) }
      else if (keep_punctuation & length(i) == 0) { i <- x }
      return(i)
    }, r, x, lowerX)
    x <- paste0(x, collapse = "")
    return(x)
  })
  x <- unlist(x, recursive = FALSE)
  return(x)
}

#' @keywords internal
.get_letter <- function(y) {
  y <- which(letters == y)
  if (length(y) == 0) { return(0) }
  else { return(y) }
}

#' @keywords internal
.rep_key <- function(x, key) {
  rep_key <- c()
  k <- 1
  return_key <- sapply(1:length(x), function(i) {
    ki <- k %% length(key)
    if (ki == 0) { ki <- length(key) }

    if (!grepl("[A-z]", x[i])) {
      rep_key <- c(rep_key, "")
    } else {
      rep_key <- c(rep_key, key[ki])
      k <<- k + 1
    }
    return(rep_key)
  })

  return(return_key)
}

test-vigenere.R

test_that("Punctuation is preserved for encryption", {
  expect_equal(vigenere("Test.message", "key", keep_punctuation = TRUE), "Diqd.qccwyqi")
  expect_equal(vigenere("testmessage", "key", keep_punctuation = FALSE), "diqdqccwyqi")
})
test_that("Punctuation is not preserved for encryption", {
  expect_equal(vigenere("Test.message", "key", keep_punctuation = FALSE), "diqdqccwyqi")
  expect_equal(vigenere("testmessage", "key", keep_punctuation = FALSE), "diqdqccwyqi")
})

test_that("Punctuation is preserved for decryption", {
  expect_equal(vigenere("Diqd.qccwyqi", "key", decrypt = TRUE, keep_punctuation = TRUE), "Test.message")
  expect_equal(vigenere("diqdqccwyqi", "key", decrypt = TRUE, keep_punctuation = TRUE), "testmessage")
})
test_that("Punctuation is not preserved for decryption", {
  expect_equal(vigenere("Diqd.qccwyqi", "key", decrypt = TRUE, keep_punctuation = FALSE), "testmessage")
  expect_equal(vigenere("diqdqccwyqi", "key", decrypt = TRUE, keep_punctuation = TRUE), "testmessage")
})

test_that("x must be a vector with length > 0", {
  expect_error(vigenere(c(), "key"))
  expect_error(vigenere(matrix(NA, 1, 1), "key"))
})
test_that("key must be a vector with length == 1", {
  expect_error(vigenere("testmessage", c()))
  expect_error(vigenere("testmessage", matrix(NA, 1, 1)))
})

test_that("decrypt must be a boolean", {
  expect_equal(vigenere("Test.message", "key", decrypt = FALSE, keep_punctuation = TRUE), "Diqd.qccwyqi")
  expect_equal(vigenere("Diqd.qccwyqi", "key", decrypt = TRUE, keep_punctuation = TRUE), "Test.message")
  expect_error(vigenere("testmessage", "key", decrypt = 1))
})

test_that("keep_punctuation must be a boolean", {
  expect_equal(vigenere("Test.message", "key", decrypt = FALSE, keep_punctuation = TRUE), "Diqd.qccwyqi")
  expect_equal(vigenere("Diqd.qccwyqi", "key", decrypt = TRUE, keep_punctuation = FALSE), "testmessage")
  expect_error(vigenere("testmessage", "key", keep_punctuation = 1))
})

Running Key Cipher

  • Description: This can be used to create (encrypt) and solve (decrypt) a Running Key Vigenere Cipher. A Vigenere cipher uses a table of alphabetic Caesar shifts for one to twenty-six. The key is made to have an equal length to the text by adding the first letters of the text to the key. Each letter and corresponding key value determine the grid location to choose the obfuscated letter from.
Plaintext R i s g r e a t
Running key a l l a g r e e a
Ciphertext R t d g x v e x

Note

Can we take a look at the description. Is there a more concise way to communicate this?

Parameters

These are the same as the parameters for vigenere() because running_key() is just a modified version.

  • x A vector to be encoded or decoded.
  • key A character vector of length one to use as a key
  • decrypt (Default: FALSE) The default FALSE will encrypt while using TRUE will decrypt a given value of x.
  • keep_punctuation (Default: FALSE) The default FALSE will ignore case and punctuation and return a lowercase result. TRUE will match the input’s case and punctuation.

Examples

key <- "thisismysupersecurekey"
(e1 <- running_key("abcde", key))
[1] "tikvm"
running_key(e1, key, decrypt = TRUE)
[1] "abcde"
(e2 <- running_key("cipheR is a great R package!", key))
[1] "vpxzmjuqsagirlvrutokkc"
running_key(e2, key, decrypt = TRUE)
[1] "cipherisagreatrpackage"
(e3 <- running_key("Isn't this fun?", key, keep_punctuation = TRUE))
[1] "'"
[1] " "
[1] " "
[1] "?"
[1] "Bzv'l bzuq xoc?"
running_key(e3, key, decrypt = TRUE, keep_punctuation = TRUE)
[1] "Isn't this fun?"

running_key.R

running_key <- function(x, key, decrypt = FALSE, keep_punctuation = FALSE) {
  if (length(x) != 1) { stop("Please provide a vector of length one for x") 
  } else if (!is.atomic(x) & !is.character(x) & !is.matrix(x)) {
    stop("x must be a character vector.")
  }

  if (length(key) != 1) { stop("Please provide a vector of length one for key") 
  } else if (!is.character(key) & (is.matrix(key) | !is.atomic(key))) {
    stop("key must be a character vector.")
  }

  y <- unlist(strsplit(x, ""))
  y <- y[grepl("[A-z]", y)]

  k <- unlist(strsplit(key, ""))
  k <- k[grepl("[A-z]", k)]

  if(length(k) >= length(y)) {
    x <- vigenere(x, paste0(k, collapse = ""), decrypt = decrypt, keep_punctuation = keep_punctuation)
    return(x)
  } else {
    stop(paste("The key must have an equal or greater number of letters to the text. The key has",
               length(k), "characters and the message has", length(y), "characters."))
  }
}

test-running_key.R

test_that("Encryption works", {
  expect_equal(running_key("testmessage", "thisisakeytothecipher"), "mlaluwsceex")
})

test_that("Decryption works", {
  expect_equal(running_key("mlaluwsceex", "thisisakeytothecipher", decrypt = TRUE), "testmessage")
})

test_that("Key must be longer than x", {
  expect_error(running_key("testmessage", "key"))
})

What I’m hoping to get feedback on

Do your worst! Or best, constructive criticism is always good.

  • Error handling (and if we have time, testing)
  • UX and how best to work with vector inputs
  • What kinds of code comments people expect in open source
  • Any spots where I can swap an apply for vectorization

Questions I have for you guys

  • What is the best way to handle moving validation into their own functions? Do they get their own R file and they’re all marked as internal?