The goal of this file is to create, explore, and save a small dataframe.

Environment

We start the notebook by loading packages of interest:

library(ggplot2) # for nice figures

.libPaths()
## [1] "/shared/ifbstor1/software/miniconda/envs/r-4.5.1/lib/R/library"

Data

We define the directory to work in:

save_dir = "/shared/projects/2538_eb3i_n1_2025/atelier_scrnaseq/cours_intro_rmd/"

We create a dataframe:

my_data = data.frame(A = c(1,1,2,4,4,4,6),
                     B = c(2,2,1,3,4,5,6))
my_data
##   A B
## 1 1 2
## 2 1 2
## 3 2 1
## 4 4 3
## 5 4 4
## 6 4 5
## 7 6 6

Analysis

In this section, we explore the dataframe.

Description

What are the dimensions of the dataframe?

dim(my_data)
## [1] 7 2
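
The two values returned by dim() are the number of rows and the number of columns, in that order. As a small sketch on the same dataframe, nrow() and ncol() return each value separately, and str() additionally shows the column names and types. The names n_rows and n_cols are only illustrative.

```r
my_data = data.frame(A = c(1,1,2,4,4,4,6),
                     B = c(2,2,1,3,4,5,6))

n_rows = nrow(my_data)   # first element of dim(my_data)
n_cols = ncol(my_data)   # second element of dim(my_data)
c(n_rows, n_cols)

str(my_data)             # compact view: dimensions, column names and types
```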

We make a descriptive summary of the data:

summary(my_data)
##        A               B        
##  Min.   :1.000   Min.   :1.000  
##  1st Qu.:1.500   1st Qu.:2.000  
##  Median :4.000   Median :3.000  
##  Mean   :3.143   Mean   :3.286  
##  3rd Qu.:4.000   3rd Qu.:4.500  
##  Max.   :6.000   Max.   :6.000
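
Each value reported by summary() can be recomputed individually, which helps to check how it is obtained. The sketch below does this for column A and then computes the mean of every column at once. The name col_means is only illustrative, and the quartiles rely on the default method of quantile().

```r
my_data = data.frame(A = c(1,1,2,4,4,4,6),
                     B = c(2,2,1,3,4,5,6))

mean(my_data$A)                              # 22 / 7, displayed as 3.143 by summary()
median(my_data$A)                            # middle value of the sorted column
quantile(my_data$A, probs = c(0.25, 0.75))   # 1st and 3rd quartiles

# Apply mean() to every column of the dataframe
col_means = sapply(my_data, mean)
col_means
```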

Visualization

We make a histogram to visualize the distribution of column B. For this purpose, we use hist(), from the graphics package that is installed and attached by default with R.

hist(my_data$B)
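
hist() also returns the bins it computed, so we can inspect them without drawing anything. As a sketch, assuming we want one bin per integer value of B, we pass explicit breaks centered on the integers. The object name h, the title, and the axis label are only illustrative choices.

```r
my_data = data.frame(A = c(1,1,2,4,4,4,6),
                     B = c(2,2,1,3,4,5,6))

# One bin per integer value: break points at 0.5, 1.5, ..., 6.5
my_breaks = seq(0.5, 6.5, by = 1)

# plot = FALSE returns the histogram object instead of drawing it
h = hist(my_data$B, breaks = my_breaks, plot = FALSE)
h$counts   # number of observations in each bin

# Same bins, drawn with a title and an axis label
hist(my_data$B, breaks = my_breaks, main = "Distribution of B", xlab = "B")
```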

We make a similar histogram with the ggplot2 package, which gives finer control over the appearance of the figure. This package builds a figure from layers, combined with a +.

ggplot(my_data, aes(x = B)) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
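
The message above appears because geom_histogram() falls back to 30 bins when no bin size is given. As a sketch, assuming a bin width of 1 suits these integer values, we set binwidth explicitly and add one more layer for the axis labels. The object name p, the outline color, and the labels are only illustrative choices.

```r
library(ggplot2)

my_data = data.frame(A = c(1,1,2,4,4,4,6),
                     B = c(2,2,1,3,4,5,6))

p = ggplot(my_data, aes(x = B)) +
  geom_histogram(binwidth = 1, color = "white") +  # explicit bin width, no message
  labs(x = "B", y = "Count")                       # extra layer: axis labels
p
```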

Save

At the end of the file, we save the object of interest. Here, this is my_data.

We prepare the output path for the file:

filename_to_save = paste0(save_dir, "my_data.csv")
filename_to_save
## [1] "/shared/projects/2538_eb3i_n1_2025/atelier_scrnaseq/cours_intro_rmd/my_data.csv"
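
paste0() gives a valid path here only because save_dir ends with a "/". As an alternative sketch, file.path() inserts the separator itself, which is convenient when the directory is stored without a trailing slash. In the example, out_dir is a hypothetical stand-in for save_dir that points to the temporary directory of the session.

```r
# Hypothetical output directory (no trailing slash), used instead of save_dir
out_dir = tempdir()

# file.path() joins the parts with "/" between them
filename_to_save = file.path(out_dir, "my_data.csv")
filename_to_save

# Check that the directory exists before writing into it
dir.exists(out_dir)
```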

We save the dataframe at this location:

write.csv(my_data, file = filename_to_save)
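
By default, write.csv() also writes the row names as a first, unnamed column. As a sketch, assuming the row names carry no information (they are just 1 to 7 here), we can drop them with row.names = FALSE and read the file back to check its content. The path tmp_file is a hypothetical temporary file, not the project path used above.

```r
my_data = data.frame(A = c(1,1,2,4,4,4,6),
                     B = c(2,2,1,3,4,5,6))

# Hypothetical temporary file, so the example does not touch the project directory
tmp_file = tempfile(fileext = ".csv")

# Write only the columns A and B, without the row names
write.csv(my_data, file = tmp_file, row.names = FALSE)

# Read the file back to check what was saved
reread = read.csv(tmp_file)
reread
```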

R Session

The following packages and their versions were loaded:

sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-conda-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
## 
## Matrix products: default
## BLAS/LAPACK: /shared/ifbstor1/software/miniconda/envs/r-4.5.1/lib/libopenblasp-r0.3.30.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Paris
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_4.0.0
## 
## loaded via a namespace (and not attached):
##  [1] vctrs_0.6.5        cli_3.6.5          knitr_1.50         rlang_1.1.6       
##  [5] xfun_0.54          generics_0.1.4     S7_0.2.0           jsonlite_2.0.0    
##  [9] labeling_0.4.3     glue_1.8.0         htmltools_0.5.8.1  sass_0.4.10       
## [13] scales_1.4.0       rmarkdown_2.30     grid_4.5.1         tibble_3.3.0      
## [17] evaluate_1.0.5     jquerylib_0.1.4    fastmap_1.2.0      yaml_2.3.10       
## [21] lifecycle_1.0.4    compiler_4.5.1     dplyr_1.1.4        RColorBrewer_1.1-3
## [25] pkgconfig_2.0.3    rstudioapi_0.17.1  farver_2.1.2       digest_0.6.37     
## [29] R6_2.6.1           tidyselect_1.2.1   dichromat_2.0-0.1  pillar_1.11.1     
## [33] magrittr_2.0.4     bslib_0.9.0        withr_3.0.2        tools_4.5.1       
## [37] gtable_0.3.6       cachem_1.1.0