
Run an analysis using Data Cruncher


## Overview

Data Cruncher allows you to enter and execute Python, R or Julia code to perform further analyses on your data on the Platform. This page explains how to access Data Cruncher from a project on the Platform, set up an analysis and execute code within the analysis.

### [ 1 ] Access Data Cruncher

1. Open the desired project on the Platform.
This project should contain the data that you want to analyze further using Data Cruncher.
2. From the project's dashboard, click the **Interactive Analysis** tab.
The list of available interactive analysis tools opens.
3. On the **Data Cruncher** card, click **Open**.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/b427f41-cruncher_card.png",
        "cruncher_card.png",
        293,
        441,
        "#eeebec"
      ]
    }
  ]
}
[/block]
This takes you to the Data Cruncher home page. If you have previous analyses, they are listed on this page.

### [ 2 ] Create and set up your analysis
1. In the top-right corner, click **Create analysis**.
The **Create new analysis** wizard is displayed.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/11b5d30-f4c-run-an-analysis-using-data-cruncher-2.png",
        "f4c-run-an-analysis-using-data-cruncher-2.png",
        558,
        398,
        "#e9ebec"
      ]
    }
  ]
}
[/block]
2. On the first screen, name your analysis in the **Analysis name** field.
3. Select **JupyterLab** or **RStudio** as the analysis environment.
4. Click **Next**.
5. Select the instance for the analysis.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/1505173-f4c-run-an-analysis-using-data-cruncher-3.png",
        "f4c-run-an-analysis-using-data-cruncher-3.png",
        559,
        398,
        "#e6e9ea"
      ]
    }
  ]
}
[/block]
The **Instance type** list displays available instances along with their disk size, number of vCPUs and memory (shown in brackets). The default instance is **c3.2xlarge**, which has **160 GB** of SSD storage, **8 vCPUs** and **15 GB** of RAM.

<a name="instance-inactivity" style="color: #747c84;">**Suspend time**</a> is the period of analysis inactivity after which the instance is stopped automatically. Inactivity means that:
* No files have been modified or created in the analysis (in the `/sbgenomics/workspace` directory) (JupyterLab and RStudio).
* There are no running kernels (JupyterLab).

Automatic suspension stops the instance, stops the analysis, and saves all analysis files and output files. The minimum suspend time is 15 minutes.

6. Click **Start the analysis**.
The Platform starts acquiring a suitable instance for your analysis, which may take a few minutes. Analysis initialization goes through the following stages:
    * **Allocating the instance for your analysis** - Obtain an instance from the cloud infrastructure provider.
    * **Preparing the allocated instance** - Load the required software onto the instance.
    * **Doing the final setup of the analysis environment** - Apply final settings and initialize the analysis environment.

Once an instance is ready, you will be notified.
[block:callout]
{
  "type": "info",
  "body": "If you don't have execute permissions in the project where the analysis is being created, the button is labelled **Create the analysis**. This allows you to create the analysis in draft state with the defined settings, but not execute it."
}
[/block]
### [ 3 ] Start the analysis

Once the Platform has acquired an instance for your analysis, click **Open in editor** to open the editor and run your analysis.

Depending on the selected editing environment, you are presented with the following options:

### JupyterLab
**Notebook** - select whether to create a **Python 2**, **Python 3**, **R** or **Julia** notebook. A notebook is the central element of a Data Cruncher analysis in JupyterLab, where you can enter your code, but also store equations, visualizations and explanatory text.
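
As an illustration of a first notebook cell, the following Python sketch lists files under `/sbgenomics/workspace` that changed recently. It mirrors the file-modification condition of the suspend-time rule described above, but it is not the Platform's own inactivity check. The function name `recently_modified` and the 15-minute default window are assumptions made for this example.

```python
import time
from pathlib import Path

# Workspace directory named in the suspend-time description.
WORKSPACE = Path("/sbgenomics/workspace")


def recently_modified(root, window_minutes=15.0, now=None):
    """Return files under `root` modified within the last `window_minutes`.

    Hypothetical helper for illustration only; the Platform's actual
    inactivity detection is not documented on this page.
    """
    now = time.time() if now is None else now
    cutoff = now - window_minutes * 60
    root = Path(root)
    if not root.is_dir():
        # Outside a Data Cruncher analysis the directory may not exist.
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_mtime >= cutoff
    )


if __name__ == "__main__":
    files = recently_modified(WORKSPACE)
    print(f"{len(files)} file(s) changed in the last 15 minutes")
    for path in files:
        print(path)
```

An empty result only says that no file changed in the chosen window. The suspend-time rule for JupyterLab also requires that no kernels are running, which this sketch does not inspect.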

**Console** - select any of the **Python 2**, **Python 3**, **R** or **Julia** options if you prefer to run your code directly in the console.

**Other** - this section offers the following options:
   * **Text Editor** - used to create any text-based file that you want to have or use during your analysis. For example, if you need to add a JSON file to your analysis files, you can select this option, enter or paste the JSON content and save the file with a .json extension.
   * **Terminal** - a familiar way of interacting with the system that brings the functionality of a Linux shell into the Data Cruncher analysis environment.

### RStudio (beta)
The RStudio editing environment opens right away, giving you direct access to the R console. To enter your R code as an R script or notebook instead of typing it directly into the console, select **File** > **New File** from the main menu bar and choose the appropriate file type.
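
Returning to the JupyterLab **Text Editor** example, a JSON file saved there can be read back from a Python notebook or console. This is a minimal sketch: the file name `settings.json`, its location in `/sbgenomics/workspace` and the helper name `load_settings` are assumptions chosen for illustration, so substitute the path where you actually saved the file.

```python
import json
from pathlib import Path


def load_settings(path):
    """Read a JSON file and return its parsed content."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Hypothetical file created with the Text Editor option.
    example = Path("/sbgenomics/workspace/settings.json")
    if example.exists():
        print(load_settings(example))
    else:
        print(f"{example} not found")
```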