Work in progress
> a common file format is essential for you and morphometrics
> .mom files and <mom> tibbles are the same object, presented differently
Since I started doing morphometrics on a daily basis, I have always been surprised that, to the best of my knowledge, no common file format for morphometrics exists.
Perhaps even more surprising, nobody seems to really miss it.
A common file format has direct benefits, from the single user to the community scale: data become easy to share, archive, reuse, recycle, publish, etc.
In my humble opinion, morphometrics as a whole, and the questions it feeds, can hardly scale up without a common file format.
.mom: yaml file and markup language

An ideal file format for morphometrics would be:
Good news, this exists: yaml (todo) (and more generally…). The Wikipedia pages on yaml and on markup languages are good introductions.
For general i/o operations with yaml, the eponymous package by xx is just perfect.
<mom> and .mom
A .mom file is nothing but a yaml file with the .mom extension. Actually, a .mom file is a textual and structured representation of a <mom> tibble, no matter how small or complex it is.
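To make the "yaml file with another extension" idea concrete, here is a minimal sketch that uses only the yaml package. The `shape` list and its values are made up for illustration, and the sketch does not rely on any Momit function: it only shows that a file ending in .mom can be written and read with generic yaml tools.

```r
# A hypothetical one-shape dataset, stored as a plain nested list
shape <- list(
  coo  = list(x = c(33, 141, 329, 238), y = c(437, 44, 201, 668)),
  type = "beer",
  size = -123.45
)

# Write it to a temporary file carrying the .mom extension
path <- tempfile(fileext = ".mom")
yaml::write_yaml(shape, path)

# The file is plain text: print its content as it sits on disk
cat(readLines(path), sep = "\n")

# Read it back: the yaml parser does not care about the extension
shape2 <- yaml::read_yaml(path)
str(shape2)
```

Because the content is plain yaml, such a file should also be readable from any other language that ships a yaml parser.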
Chivas looks like this in R, e.g. as a <mom> tibble:
library(Momit)
# shorten it for the sake of readability
chivas <- chivas %>% Momocs2::coo_sample(4)
And like this in yaml, e.g. in a .mom file:
chivas_yaml <- chivas %>% yaml::as.yaml()
# cat helps visualize what would be written in a .mom file
chivas_yaml %>% cat()
#> coo:
#>   chivas:
#>     x:
#>     - 33.0
#>     - 141.0
#>     - 329.0
#>     - 238.0
#>     'y':
#>     - 437.0
#>     - 44.0
#>     - 201.0
#>     - 668.0
#> type: beer
#> fake: c
#> size: -123.45
#> missing: .na
Now, let’s read chivas_yaml back into R:
chivas2 <- chivas_yaml %>% yaml::yaml.load() %>% tibble::as_tibble()
chivas2
#> # A tibble: 1 x 5
#>   coo              type  fake   size missing
#>   <named list>     <chr> <chr> <dbl> <lgl>
#> 1 <named list [2]> beer  c     -123. NA
And, finally, let’s compare with the original tibble. We first downgrade the mom to a plain tibble:
chivas %>% tibble::as_tibble()
#> # A tibble: 1 x 5
#>   coo                    type  fake   size missing
#>   <list<coo_single[,2]>> <fct> <fct> <dbl> <lgl>
#> 1 <tibble [4 × 2]>       beer  c     -123. NA
Not bad, is it? If you dissect the two prints, you will notice that three things were dropped:
- the mom_df class (but we asked for it)
- the coo_list nature of the coo column. Now it’s just a list
- the names within this column