-
Notifications
You must be signed in to change notification settings - Fork 44
bigquery
Initially we released AmsterdamUMCdb as downloadable csv files that could be obtained by applying for access from Amsterdam Medical Data Science. However, many individuals and organisations were struggling with setting up an initial database environment that was performant enough for basic data exploration, let alone modelling. In addition, to allow temporarily access for courses and datathons, setting up on premise environments was cumbersome. For this reason, the most recent version of AmsterdamUMCdb is also available from Google BigQuery for authorised user with Google Colaboratory as the main coding environment. This removes the necessity of downloading AmsterdamUMCdb, setting up a database system and installing a coding environment.
Unless you have enrolled in a course or datathon that uses AmsterdamUMCdb, you will need to request access by going to the Amsterdam Medical Data Science website.
Important: Ensure you have a working Google account and verify that the e-mail address used when applying for access has been associated with this account. If you already have a Google account, you can add secondary e-mail addresses here, or alternatively create another Google account using the e-mail address used during application for access.
If you don't have any experience in using Jupyter notebooks and/or Python, it is recommended to familiarize yourself with the basics.
Open Colab with the getting_started notebook from the official AmsterdamUMCdb GitHub
repository:
In contrast to the downloadable version of AmsterdamUMCdb, we have moved unvalidated (device)
data (registeredby IS NULL
) from the numericitems
and listitems
table to separate
tables (numericitems_unvalidated
and listitems_unvalidated
respectively) to reduce query costs and memory
consumption when unvalidated device data is not needed. However, depending on the chosen solution, you may need to
combine the data from both tables.
The example Jupyter notebooks were developed using the PostgreSQL database system as the backend and thus the SQL code used in the notebooks use the PostgreSQL dialect to query the database. However, BigQuery uses either Google Standard SQL or Legacy SQL. While simple queries will run unmodified, more complex queries often require modifying the SQL code.
- General remarks
- Error: OSError: Project was not passed and could not be determined from the environment.
- ERROR: 403 Access Denied: Table amsterdamumcdb-data:ams102.admissions: User does not have permission to query table
- Error: The project your_id has not enabled BigQuery.
Please follow the instructions in the getting_started notebook
carefully:
. If you make a mistake, depending on the API used to access Google BigQuery, it may be easier to fix mistakes by
restarting from the beginning by selecting
Runtime
> Restart Runtime
(Ctrl+M .
).
Make sure to enter your personal or corporate Google Cloud Project ID into the provided text box AND run the cell to set
the PROJECT_ID
variable. The Project ID will be charged by Google depending on the amount of data processed.
Most likely you are not using the Google account associated with the e-mail address while applying for access. Verify that the e-mail address used when applying for access has been associated with this account. If you already have a Google account, you can add secondary e-mail addresses here, or alternatively create another Google account using the e-mail address used during registration. In addition, unless you have applied for access following the regular procedure on Amsterdam Medical Data Science, access will be revoked after the course or datathon.
Please make sure you are actually using the Google Project ID, and not the Project Number or Project Name. For more information, see Locate the Project ID
AmsterdamUMCdb - Freely Accessible ICU Database
version 1.0.2 March 2020
Copyright © 2003-2022 Laboratory for Critical Care Computational Intelligence - Amsterdam UMC - Amsterdam Medical Data Science