{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "# Single-cell Hi-C data analysis\n", "\n", "Welcome to the second part of our analysis. Here we will work specifically with single-cell data. \n", "\n", "The outline:\n", "\n", "1. [Reads mapping](#mapping)\n", "2. [Data filtering and binning](#filtering)\n", "3. [Hi-C data visualisation](#visualisation)\n", "4. [Compartments and TADs](#meta)\n", "\n", "If you have any questions, please, contact Aleksandra Galitsyna (agalitzina@gmail.com)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 1. Reads mapping\n", "[Go top](#navigation)\n", "\n", "In this notebook, we will be working with three datasets from [Flyamer et al. 2017](http://www.nature.com/nature/journal/v544/n7648/full/nature21711.html?foxtrotcallback=true) (GEO ID [GSE80006](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE80006)) placed in DATA/FASTQ/ directory.\n", "Opposite to previous analysis, now you can work with single-cell Hi-C results. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import os\n", "\n", "from hiclib import mapping\n", "from mirnylib import h5dict, genome" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "bowtie_path = '/opt/conda/bin/bowtie2'\n", "\n", "enzyme = 'DpnII'\n", "\n", "bowtie_index_path = '/home/jovyan/GENOMES/HG19_IND/hg19_chr1'\n", "fasta_path = '/home/jovyan/GENOMES/HG19_FASTA/'\n", "chrms = ['1']\n", "\n", "genome_db = genome.Genome(fasta_path, readChrms=chrms)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "min_seq_len = 120\n", "len_step = 5\n", "nthreads = 2\n", "temp_dir = 'tmp'\n", "bowtie_flags = '--very-sensitive'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case, we have multiple datasets, thus we have to iterate through the list of files. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "experiment_ids = ['72-sc-1', '54-sc-1', '58-sc-1']" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [], "source": [ "for exp_id in experiment_ids:\n", " \n", " infile1 = '/home/jovyan/DATA/FASTQ1/K562_{}_R1.fastq'.format(exp_id)\n", " infile2 = '/home/jovyan/DATA/FASTQ1/K562_{}_R2.fastq'.format(exp_id)\n", " out1 = '/home/jovyan/DATA/SAM/K562_{}_R1.chr1.sam'.format(exp_id)\n", " out2 = '/home/jovyan/DATA/SAM/K562_{}_R2.chr1.sam'.format(exp_id)\n", "\n", " mapping.iterative_mapping(\n", " bowtie_path = bowtie_path,\n", " bowtie_index_path = bowtie_index_path,\n", " fastq_path = infile1,\n", " out_sam_path = out1,\n", " min_seq_len = min_seq_len,\n", " len_step = len_step,\n", " nthreads = nthreads,\n", " temp_dir = temp_dir, \n", " bowtie_flags = bowtie_flags)\n", "\n", " mapping.iterative_mapping(\n", " 
bowtie_path = bowtie_path,\n", "        bowtie_index_path = bowtie_index_path,\n", "        fastq_path = infile2,\n", "        out_sam_path = out2,\n", "        min_seq_len = min_seq_len,\n", "        len_step = len_step,\n", "        nthreads = nthreads,\n", "        temp_dir = temp_dir,\n", "        bowtie_flags = bowtie_flags)\n", "\n", "    out = '/home/jovyan/DATA/HDF5/K562_{}.fragments.hdf5'\n", "\n", "    mapped_reads = h5dict.h5dict(out.format(exp_id))\n", "\n", "    mapping.parse_sam(\n", "        sam_basename1 = out1,\n", "        sam_basename2 = out2,\n", "        out_dict = mapped_reads,\n", "        genome_db = genome_db,\n", "        enzyme_name = enzyme,\n", "        save_seqs = False,\n", "        keep_ids = False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a id=\"filtering\"></a>\n", "## 2. Data filtering and binning\n", "[Go top](#navigation)\n", "\n", "In the case of single-cell analysis, contact filtering is more sophisticated. For example, the standard PCR duplicate filter is replaced by \"filterLessThanDistance\". The code below is based on the hiclib [single-cell scripts for Flyamer et al. 2017](https://bitbucket.org/mirnylab/hiclib/src/85979ac69b55/examples/singleCellScripts/?at=default)."
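The core idea of this filter can be sketched in plain NumPy (a toy model, not the hiclib implementation): two contacts are treated as quasi-duplicates when both of their cut positions fall into the same ~500 bp bins, and only one representative per bin pair is kept. The real filter additionally shifts the positions by multiples of 100 bp and encodes the chromosome/strand combination in the key.

```python
import numpy as np

# Toy contacts: (cut1, cut2) positions on one chromosome/strand combination.
# The values are invented for illustration.
cuts1 = np.array([1000, 1120, 50000, 50010, 90000])
cuts2 = np.array([8000, 8150, 70000, 70020, 95000])

# Bin both sides at 500 bp and build a combined key, as the filter does.
key = (cuts1 // 500).astype(np.int64) * 10000000 + (cuts2 // 500)

# Keep the first contact in each bin pair; the rest are quasi-duplicates.
_, first_idx = np.unique(key, return_index=True)
keep = np.zeros(len(key), dtype=bool)
keep[first_idx] = True

print(keep.sum())  # 3 of 5 contacts survive
```

Here the first two contacts collide (both sides within 500 bp of each other), as do the third and fourth, so three of the five contacts survive. This proximity-based deduplication matters in single cells, where PCR duplicates of a true contact rarely have byte-identical coordinates after sequencing errors and trimming.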
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import h5py\n", "import numpy as np\n", "\n", "from hiclib import fragmentHiC\n", "from hiclib.fragmentHiC import HiCdataset as HiCdatasetorig\n", "from mirnylib.numutils import uniqueIndex\n", "\n", "class HiCdataset(HiCdatasetorig):\n", "    \"Modification of HiCDataset to include all filters\"\n", "\n", "    def filterLessThanDistance(self):\n", "        # This is the old function used to filter \"duplicates\".\n", "        # After the final submission of the manuscript, it was replaced by a better\n", "        # function that does the same, but at bp resolution, not 100 bp.\n", "        M = self.N\n", "        for i in range(5):\n", "            for j in range(5):\n", "                chrStrandID = 10000000 * 10000000 * (np.array(self.chrms1 * (self.strands1 + 1), dtype = np.int64) * 100 + self.chrms2 * (self.strands2 + 1))\n", "                print(len(np.unique(chrStrandID)))\n", "                posid = np.array((self.cuts1 + i * 100) // 500, dtype = np.int64) * 10000000 + (self.cuts2 + j * 100) // 500\n", "                N = self.N\n", "                self.maskFilter(uniqueIndex(posid + chrStrandID))\n", "                print(N, \"filtered to\", self.N)\n", "        self.metadata[\"321_quasiDuplicatesRemoved\"] = M - self.N\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "output = []\n", "\n", "for exp_id in experiment_ids:\n", "\n", "    inp = '/home/jovyan/DATA/HDF5/K562_{}.fragments.hdf5'.format(exp_id)\n", "\n", "    out = '/home/jovyan/DATA/HDF5/K562_{}.tmp.hdf5'.format(exp_id)\n", "    outstat = '/home/jovyan/DATA/HDF5/K562_{}.stat.txt'.format(exp_id)\n", "\n", "    fragments = HiCdataset(\n", "        filename = out,\n", "        genome = genome_db,\n", "        maximumMoleculeLength = 500,\n", "        enzymeName = 1000,\n", "        mode = 'w')\n", "\n", "    fragments.parseInputData(\n", "        dictLike=inp)\n", "\n", "    fragments.filterLessThanDistance()\n", "    fs = fragments.fragmentSum()\n", "    fragments.fragmentFilter(fs < 9)\n", "\n", "    
output.append(list(fragments.metadata.items()))\n", "\n", "    out_bin = '/home/jovyan/DATA/HDF5/K562_{}.binned_{}.hdf5'\n", "\n", "    res_kb = [100, 20]\n", "\n", "    for res in res_kb:\n", "        print(res)\n", "        outmap = out_bin.format(exp_id, str(res)+'kb')\n", "\n", "        fragments.saveHeatmap(outmap, res*1000)\n", "\n", "    del fragments" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a id=\"visualisation\"></a>\n", "## 3. Hi-C data visualisation\n", "[Go top](#navigation)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from hiclib.binnedData import binnedDataAnalysis" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "res = 100\n", "data_hic = binnedDataAnalysis(resolution=res*1000, genome=genome_db)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "for exp_id in experiment_ids:\n", "    data_hic.simpleLoad('/home/jovyan/DATA/HDF5/K562_{}.binned_{}.hdf5'.format(exp_id, str(res)+'kb'), exp_id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "sns.set_style('ticks')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [], "source": [ "plt.figure(figsize=[10,10])\n", "plt.imshow(data_hic.dataDict['54-sc-1'][200:500, 200:500], cmap='jet', interpolation='none')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Comparison of single-cell and bulk datasets" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "data_hic.simpleLoad('/home/jovyan/DATA/HDF5/K562_B-bulk.binned_{}.hdf5'.format(str(res)+'kb'),'bulk')\n", "data_hic.removeDiagonal()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "mtx1 = data_hic.dataDict['bulk']\n", "mtx2 = data_hic.dataDict['54-sc-1']\n", "mtx_tmp = np.triu(mtx1)/np.mean(mtx1) + np.tril(mtx2)/np.mean(mtx2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "plt.figure(figsize=[10,10])\n", "plt.imshow(mtx_tmp[200:500, 200:500], cmap='Blues', interpolation='None', vmax=900)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "mtx_merged = sum([data_hic.dataDict[exp_id] for exp_id in experiment_ids]) " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "mtx1 = data_hic.dataDict['bulk']\n", "mtx2 = mtx_merged\n", "mtx_tmp = np.triu(mtx1)/np.mean(mtx1) + np.tril(mtx2)/np.mean(mtx2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "plt.figure(figsize=[10,10])\n", "plt.imshow(mtx_tmp[200:500, 200:500], cmap='Blues', interpolation='None', vmax=800)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 5. 
Compartments and TADs\n", "[Go top](#navigation)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from matplotlib import gridspec" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "eig = np.loadtxt('/home/jovyan/DATA/ANNOT/comp_K562_100Kb_chr1.tsv')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "bgn = 0\n", "end = 500\n", "\n", "fig = plt.figure(figsize=(10,10))\n", "\n", "gs = gridspec.GridSpec(2, 1, height_ratios=[20,2]) #width_ratios=[2,20]\n", "gs.update(wspace=0.0, hspace=0.0)\n", "\n", "ax = plt.subplot(gs[0,0])\n", "\n", "ax.matshow(mtx_tmp[bgn:end, bgn:end], cmap='jet', origin='lower', aspect='auto')\n", "ax.set_xticks([])\n", "ax.set_yticks([])\n", "\n", "axl = plt.subplot(gs[1,0]) #, sharey=ax\n", "plt.plot(range(end-bgn), eig[bgn:end])\n", "plt.xlim(0, end-bgn)\n", "plt.ylabel('Eigenvector values')\n", "\n", "ticks = range(bgn, end+1, 100)\n", "ticklabels = ['{} Mb'.format(x * res / 1000) for x in ticks]\n", "plt.xticks(ticks, ticklabels)\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Topologically associating domains (TADs)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import lavaburst" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "mtx = data_hic.dataDict['54-sc-1']" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "good_bins = mtx.astype(bool).sum(axis=0) > 1 # We have to mask rows/cols if data is missing\n", "\n", "gammas = [0.1, 1.0, 5.0, 10.0] # set of gamma parameters for TAD calling\n", "\n", "segments_dict = {}\n", "\n", "for gam_current in gammas:\n", "    print(gam_current)\n", "\n", "    S = 
lavaburst.scoring.armatus_score(mtx, gamma=gam_current, binmask=good_bins)\n", "    model = lavaburst.model.SegModel(S)\n", "    segments = model.optimal_segmentation() # Positions of TADs for input matrix\n", "\n", "    segments_dict[gam_current] = segments.copy()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# TADs at different parameters for a particular cell (54-sc-1)\n", "\n", "A = mtx.copy()\n", "\n", "good_bins = A.astype(bool).sum(axis=0) > 0\n", "\n", "At = lavaburst.utils.tilt_heatmap(mtx, n_diags=100)\n", "\n", "start_tmp = 0\n", "end_tmp = 500\n", "\n", "f = plt.figure(figsize=(20, 6))\n", "\n", "ax = f.add_subplot(111)\n", "blues = sns.cubehelix_palette(0.4, gamma=0.5, rot=-0.3, dark=0.1, light=0.9, as_cmap=True)\n", "ax.matshow(np.log(At[start_tmp: end_tmp]), cmap=blues)\n", "\n", "cmap = mpl.cm.get_cmap('brg')\n", "\n", "gammas = sorted(segments_dict.keys())\n", "for n, gamma in enumerate(gammas):\n", "\n", "    segments = segments_dict[gamma]\n", "\n", "    for a in segments[:-1]:\n", "        if a[1] < start_tmp or a[0] > end_tmp:\n", "            continue\n", "        ax.plot([a[0]-start_tmp, a[0]+(a[1]-a[0])/2-start_tmp], [0, -(a[1]-a[0])], c=cmap(n/float(len(gammas))), alpha=0.5)\n", "        ax.plot([a[0]+(a[1]-a[0])/2-start_tmp, a[1]-start_tmp], [-(a[1]-a[0]), 0], c=cmap(n/float(len(gammas))), alpha=0.5)\n", "\n", "    a = segments[-1]\n", "    ax.plot([a[0]-start_tmp, a[0]+(a[1]-a[0])/2-start_tmp], [0, -(a[1]-a[0])], c=cmap(n/float(len(gammas))), alpha=0.5, label=gamma)\n", "    ax.plot([a[0]+(a[1]-a[0])/2-start_tmp, a[1]-start_tmp], [-(a[1]-a[0]), 0], c=cmap(n/float(len(gammas))), alpha=0.5)\n", "\n", "ax.set_xlim([0,end_tmp-start_tmp])\n", "ax.set_ylim([100,-100])\n", "\n", "ax.legend(bbox_to_anchor=(1.1, 1.05))\n", "ax.set_aspect(0.5)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "The problem of choosing the free parameter for TAD calling can be resolved via parameter optimization. 
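One simple selection criterion can be sketched with hypothetical numbers (invented for illustration, not computed from the data): scan a grid of gammas and keep the one whose mean TAD length lands closest to the target size.

```python
import numpy as np

# Hypothetical mean TAD sizes (in bins) observed at each gamma; assumed values.
# At 100 kb resolution, 5 bins corresponds to the ~500 kb expectation.
gammas = np.array([0.5, 1.0, 2.0, 4.0])
mean_sizes = np.array([12.0, 8.0, 5.0, 3.0])

target_bins = 500 / 100  # expected TAD size divided by bin size
best_gamma = gammas[np.argmin(np.abs(mean_sizes - target_bins))]
print(best_gamma)  # 2.0
```

The cell below applies exactly this argmin-over-a-gamma-grid idea, except that the mean sizes come from actual lavaburst segmentations of each single-cell matrix.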
For example, the best parameter can be selected so that the mean TAD size matches expectations (~500 Kb in this case). " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "optimal_gammas = {}\n", "\n", "for exp_id in experiment_ids:\n", "\n", "    mtx = data_hic.dataDict[exp_id][0:1000, 0:1000]\n", "\n", "    good_bins = mtx.astype(bool).sum(axis=0) > 1 # We have to mask rows/cols if data is missing\n", "\n", "    gammas = np.arange(2, 24, 1)*1000./3250 # Desired set of gammas for testing (note the float division)\n", "\n", "    means = []\n", "\n", "    for gam_current in gammas:\n", "\n", "        S = lavaburst.scoring.armatus_score(mtx, gamma=gam_current, binmask=good_bins)\n", "        model = lavaburst.model.SegModel(S)\n", "        segments = model.optimal_segmentation() # Positions of TADs for input matrix\n", "\n", "        tad_lens = segments[:,1]-segments[:,0]\n", "        good_lens = (tad_lens>=200/res)&(tad_lens<900/res) # We do not consider too large or too small segments as TADs\n", "\n", "        means.append(np.mean(tad_lens[good_lens]))\n", "\n", "    idx = np.argmin(np.abs(np.array(means)-500/res))\n", "    opt_mean, opt_gamma = means[idx], gammas[idx]\n", "\n", "    print(exp_id, opt_mean*res, opt_gamma)\n", "\n", "    optimal_gammas[exp_id] = opt_gamma" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# TADs in single cells compared with merged single-cell data\n", "\n", "A = mtx.copy()\n", "\n", "good_bins = A.astype(bool).sum(axis=0) > 0\n", "\n", "At = lavaburst.utils.tilt_heatmap(mtx_merged, n_diags=100)\n", "\n", "start_tmp = 0\n", "end_tmp = 500\n", "\n", "f = plt.figure(figsize=(20, 6))\n", "\n", "ax = f.add_subplot(111)\n", "ax.matshow(np.log(At[start_tmp: end_tmp]), cmap='Reds')\n", "\n", "for n, exp in enumerate(experiment_ids):\n", "    A = data_hic.dataDict[exp][bgn:end, bgn:end].copy()\n", "\n", "    good_bins = A.astype(bool).sum(axis=0) > 0\n", "\n", "    gamma = optimal_gammas[exp]\n", "\n", "    S 
= lavaburst.scoring.modularity_score(A, gamma=gamma, binmask=good_bins)\n", "    model = lavaburst.model.SegModel(S)\n", "    segments = model.optimal_segmentation()\n", "\n", "    for a in segments[:-1]:\n", "        if a[1] < start_tmp or a[0] > end_tmp:\n", "            continue\n", "        tad_len = a[1]-a[0]\n", "        if (tad_len<200/res)|(tad_len>=900/res):\n", "            continue\n", "        ax.fill_between([a[0]-start_tmp, a[0]+(a[1]-a[0])/2-start_tmp, a[1]-start_tmp], [0, -(a[1]-a[0]), 0], 0,\n", "                        facecolor='#6100FF', interpolate=True, alpha=0.2)\n", "\n", "    a = segments[-1]\n", "    tad_len = a[1]-a[0]\n", "    if (tad_len<200/res)|(tad_len>=900/res):\n", "        continue\n", "    ax.fill_between([a[0]-start_tmp, a[0]+(a[1]-a[0])/2-start_tmp, a[1]-start_tmp], [0, -(a[1]-a[0]), 0], 0,\n", "                    facecolor='#6100FF', interpolate=True, alpha=0.2)\n", "\n", "ax.set_xlim([start_tmp,end_tmp])\n", "ax.set_ylim([100,-100])\n", "\n", "ax.set_aspect(0.5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# TADs in single cells compared with bulk Hi-C data\n", "\n", "A = mtx.copy()\n", "\n", "good_bins = A.astype(bool).sum(axis=0) > 0\n", "\n", "At = lavaburst.utils.tilt_heatmap(data_hic.dataDict['bulk'], n_diags=100)\n", "\n", "start_tmp = 0\n", "end_tmp = 300\n", "\n", "f = plt.figure(figsize=(20, 6))\n", "\n", "ax = f.add_subplot(111)\n", "ax.matshow(np.log(At[start_tmp: end_tmp]), cmap='Reds')\n", "\n", "for n, exp in enumerate(experiment_ids):\n", "    A = data_hic.dataDict[exp][bgn:end, bgn:end].copy()\n", "\n", "    good_bins = A.astype(bool).sum(axis=0) > 0\n", "\n", "    gamma = optimal_gammas[exp]\n", "\n", "    S = lavaburst.scoring.modularity_score(A, gamma=gamma, binmask=good_bins)\n", "    model = lavaburst.model.SegModel(S)\n", "    segments = model.optimal_segmentation()\n", "\n", "    for a in segments[:-1]:\n", "        if a[1] < start_tmp or a[0] > end_tmp:\n", "            continue\n", "        tad_len = a[1]-a[0]\n", "        if (tad_len<200/res)|(tad_len>=900/res):\n", "            continue\n", "        ax.fill_between([a[0]-start_tmp, a[0]+(a[1]-a[0])/2-start_tmp, 
a[1]-start_tmp], [0, -(a[1]-a[0]), 0], 0,\n", "                        facecolor='#6100FF', interpolate=True, alpha=0.2)\n", "\n", "    a = segments[-1]\n", "    tad_len = a[1]-a[0]\n", "    if (tad_len<200/res)|(tad_len>=900/res):\n", "        continue\n", "    ax.fill_between([a[0]-start_tmp, a[0]+(a[1]-a[0])/2-start_tmp, a[1]-start_tmp], [0, -(a[1]-a[0]), 0], 0,\n", "                    facecolor='#6100FF', interpolate=True, alpha=0.2)\n", "\n", "ax.set_xlim([start_tmp,end_tmp])\n", "ax.set_ylim([100,-100])\n", "\n", "ax.set_aspect(0.5)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 2 }