Paper basic information

Non-negligible Occurrence of Errors in Gender Description in Public Data Sets

Paper overview

Paper overview table listing the institution, journal, ISSN, and ISBN.
Institution NDSL
Journal Genomics & Informatics
ISSN 1598-866X, 2234-0742
ISBN

Author and affiliation information

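Author and affiliation table listing, in order, the authors, affiliation, publisher, issue number, publication year, abstract, full-text URL, and attachments.
Authors (Korean) Kim, Jong Hwan; Park, Jong-Luyl; Kim, Seon-Young
Authors (English)
Affiliation
Affiliation (English)
Publisher
Issue number
Publication year 2016-01-01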
Abstract Due to advances in omics technologies, numerous genome-wide studies on human samples have been published, and most of the omics data with the associated clinical information are available in public repositories, such as Gene Expression Omnibus and ArrayExpress. While analyzing several public datasets, we observed that errors in gender information occur quite often in public datasets. When we analyzed the gender description and the methylation patterns of gender-specific probes (glucose-6-phosphate dehydrogenase [G6PD], ephrin-B1 [EFNB1], and testis specific protein, Y-linked 2 [TSPY2]) in 5,611 samples produced using Infinium 450K HumanMethylation arrays, we found that 19 samples from 7 datasets were erroneously described. We also analyzed 1,819 samples produced using the Affymetrix U133Plus2 array using several gender-specific genes (X (inactive)-specific transcript [XIST], eukaryotic translation initiation factor 1A, Y-linked [EIF1AY], and DEAD [Asp-Glu-Ala-Asp] box polypeptide 3, Y-linked [DDX3Y]) and found that 40 samples from 3 datasets were erroneously described. We suggest that the users of public datasets should not expect that the data are error-free and, whenever possible, that they should check the consistency of the data.
Full-text URL http://click.ndsl.kr/servlet/OpenAPIDetailView?keyValue=03553784&target=NART&cn=JAKO201614138121960
Attachments
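
The consistency check described in the abstract can be illustrated with a minimal Python sketch for the expression-array case. It is not the authors' procedure: the record does not state how the marker genes were thresholded, so the cutoff `HYPOTHETICAL_CUTOFF`, the assumption of log2-scale expression values, and the function names `infer_sex` and `find_mismatches` are all illustrative. The idea is only that high XIST with low EIF1AY and DDX3Y suggests a female sample, the reverse suggests a male sample, and anything else is left unclassified.

```python
from typing import Dict, List, Optional

# Hypothetical cutoff on log2 expression; the source does not give one.
HYPOTHETICAL_CUTOFF = 8.0


def infer_sex(xist: float, eif1ay: float, ddx3y: float,
              cutoff: float = HYPOTHETICAL_CUTOFF) -> Optional[str]:
    """Return 'female', 'male', or None when the marker genes disagree."""
    xist_high = xist >= cutoff
    y_high = eif1ay >= cutoff and ddx3y >= cutoff
    y_low = eif1ay < cutoff and ddx3y < cutoff
    if xist_high and y_low:
        return "female"
    if not xist_high and y_high:
        return "male"
    return None  # ambiguous pattern: do not guess


def find_mismatches(samples: List[Dict]) -> List[str]:
    """Return IDs of samples whose annotated sex contradicts the inferred sex.

    Each sample is a dict with keys 'id', 'annotated', 'XIST', 'EIF1AY',
    and 'DDX3Y' (an assumed layout, not a repository format).
    """
    flagged = []
    for s in samples:
        inferred = infer_sex(s["XIST"], s["EIF1AY"], s["DDX3Y"])
        if inferred is not None and inferred != s["annotated"].lower():
            flagged.append(s["id"])
    return flagged
```

Samples with an ambiguous marker pattern are deliberately not flagged, so the sketch reports only clear contradictions between the annotation and the markers.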

Additional information

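Additional information table listing, in order, the science and technology standard classification, ICT technology classification, DDC classification, and keywords.
Science and technology standard classification
ICT technology classification
DDC classification
Keywords blood, DNA methylation, gender identity, gene expression, microarray analysis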