Commit

auto-generating sphinx docs
pytorchbot committed Feb 14, 2025
1 parent 64de895 commit 644b0af
Showing 17 changed files with 156 additions and 125 deletions.
3 binary files changed (contents not shown).
1 change: 1 addition & 0 deletions main/_modules/torchao/dtypes/floatx/float8_layout.html
@@ -662,6 +662,7 @@ Source code for torchao.dtypes.floatx.float8_layout

 ):
     """Implements matmul between FP8 input and FP8 weight with compute using _scaled_mm"""
     scaled_mm_config = weight_tensor._layout.mm_config
+    assert scaled_mm_config is not None
     out_shape = get_out_shape(input_tensor.shape, weight_tensor.shape)

     # Weight tensor preprocessing
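
For context, the assert added above guards the config consumed by the _scaled_mm call the docstring mentions. A minimal sketch of that op (a sketch only: torch._scaled_mm is a private PyTorch op whose signature varies across releases, and it needs FP8-capable hardware such as H100):

    import torch

    # FP8 matmul via torch._scaled_mm, the compute path the docstring refers to.
    # mat2 must be column-major, hence the transpose; scales are fp32 scalars.
    a = torch.randn(16, 32, device="cuda").to(torch.float8_e4m3fn)
    b = torch.randn(64, 32, device="cuda").to(torch.float8_e4m3fn).t()
    scale_a = torch.tensor(1.0, device="cuda")
    scale_b = torch.tensor(1.0, device="cuda")
    out = torch._scaled_mm(a, b, scale_a=scale_a, scale_b=scale_b, out_dtype=torch.bfloat16)
    print(out.shape)  # torch.Size([16, 64])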
189 changes: 123 additions & 66 deletions main/_modules/torchao/quantization/quant_api.html

Large diffs are not rendered by default.

@@ -6,4 +6,5 @@

 fpx_weight_only
 ===============

-.. autofunction:: fpx_weight_only
+.. autoclass:: fpx_weight_only
+   :members:
@@ -6,4 +6,5 @@

 gemlite_uintx_weight_only
 =========================

-.. autofunction:: gemlite_uintx_weight_only
+.. autoclass:: gemlite_uintx_weight_only
+   :members:
@@ -6,4 +6,5 @@

 uintx_weight_only
 =================

-.. autofunction:: uintx_weight_only
+.. autoclass:: uintx_weight_only
+   :members:
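
All three directive changes above swap .. autofunction:: for .. autoclass::, tracking torchao's migration of these APIs from bare functions to config classes. Call sites are unchanged; a usage sketch (assuming a recent torchao build and an fp16 CUDA model, which the fpx kernels expect):

    import torch
    from torchao.quantization import quantize_, fpx_weight_only

    # fpx_weight_only is now documented as a class, but it is still called the
    # same way: it returns a config that quantize_ applies to the model's linears.
    model = torch.nn.Linear(4096, 4096, bias=False).half().cuda()
    quantize_(model, fpx_weight_only(3, 2))  # fp6_e3_m2: 3 exponent bits, 2 mantissa bits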
10 changes: 5 additions & 5 deletions main/_sources/tutorials/template_tutorial.rst.txt
@@ -67,11 +67,11 @@ Example code (the output below is generated automatically):

 .. code-block:: none

-   tensor([[0.3420, 0.9864, 0.6182],
-           [0.9509, 0.6658, 0.6808],
-           [0.7630, 0.3657, 0.5552],
-           [0.8138, 0.8566, 0.2858],
-           [0.1471, 0.9493, 0.6773]])
+   tensor([[0.5739, 0.8610, 0.0928],
+           [0.2888, 0.6691, 0.5624],
+           [0.5564, 0.6546, 0.8487],
+           [0.6128, 0.9871, 0.9530],
+           [0.2163, 0.6581, 0.6389]])
6 changes: 3 additions & 3 deletions main/api_ref_quantization.html
@@ -443,10 +443,10 @@ Quantization APIs for quantize_

     alias of Int8DynamicActivationInt8WeightConfig
 uintx_weight_only
-    Applies uintx weight-only asymmetric per-group quantization to linear layers, using uintx quantization where x is the number of bits specified by `dtype`
+    alias of UIntXWeightOnlyConfig
 gemlite_uintx_weight_only
-    applies weight only 4 or 8 bit integer quantization and utilizes the gemlite triton kernel and its associated weight packing format.
+    alias of GemliteUIntXWeightOnlyConfig
 intx_quantization_aware_training
     alias of IntXQuantizationAwareTrainingConfig

@@ -461,7 +461,7 @@ Quantization APIs for quantize_

     alias of Float8StaticActivationFloat8WeightConfig
 fpx_weight_only
-    Sub-byte floating point dtypes defined by `ebits`: exponent bits and `mbits`: mantissa bits e.g.
+    alias of FPXWeightOnlyConfig
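
Each entry in the table now reads "alias of <…>Config", meaning the documented name is the config class itself. A sketch of what that implies (assuming the config classes are exported from torchao.quantization, as these class names suggest):

    import torch
    from torchao.quantization import uintx_weight_only, UIntXWeightOnlyConfig

    # Calling the alias constructs a config instance; no quantization happens yet.
    cfg = uintx_weight_only(torch.uint4, group_size=64)
    assert isinstance(cfg, UIntXWeightOnlyConfig)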
12 changes: 3 additions & 9 deletions main/generated/torchao.quantization.fpx_weight_only.html
@@ -415,16 +415,10 @@

 fpx_weight_only
-torchao.quantization.fpx_weight_only(ebits: int, mbits: int)  [source]
-    Sub-byte floating point dtypes defined by `ebits`: exponent bits and `mbits`: mantissa bits
-    e.g. fp6_e3_m2, fp6_e2_m3, ...
-    The packing format and kernels are from the fp6-llm paper: https://arxiv.org/abs/2401.14112
-    github repo: https://github.com/usyd-fsalab/fp6_llm, now renamed to quant-llm
-    For more details for packing please see: FpxTensorCoreAQTTensorImpl
-
-    This is experimental, will be merged with `to_affine_quantized_floatx` in the future
+torchao.quantization.fpx_weight_only
+    alias of FPXWeightOnlyConfig
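
Given the "alias of FPXWeightOnlyConfig" rendering above, the old function call and a direct config construction should be interchangeable. A sketch (FPXWeightOnlyConfig's export and its ebits/mbits fields are assumptions inferred from the removed signature):

    from torchao.quantization import fpx_weight_only, FPXWeightOnlyConfig

    # Two spellings of the same config, if the alias holds in your torchao build.
    cfg_a = fpx_weight_only(ebits=3, mbits=2)
    cfg_b = FPXWeightOnlyConfig(ebits=3, mbits=2)
    print(cfg_a == cfg_b)  # True for dataclass-style configs; check your version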
18 changes: 3 additions & 15 deletions main/generated/torchao.quantization.gemlite_uintx_weight_only.html
@@ -415,22 +415,10 @@

 gemlite_uintx_weight_only
-torchao.quantization.gemlite_uintx_weight_only(group_size: Optional[int] = 64, bit_width: int = 4, packing_bitwidth: int = 32, contiguous: Optional[bool] = None)  [source]
-    applies weight only 4 or 8 bit integer quantization and utilizes the gemlite triton kernel
-    and its associated weight packing format. This only works for fp16 models.
-    8 bit quantization is symmetric, 4 bit quantization is asymmetric.
-
-    Parameters:
-        group_size – parameter for quantization, controls the granularity of quantization,
-            smaller size is more fine grained
-        bit_width – bit width of the quantized weight.
-        packing_bitwidth – bit width of the packed weight, should be 8 or 32. Can have
-            performance impacts depending on hardware.
-        contiguous – if set, the weight will be packed as specified. Leaving it as None
-            lets gemlite determine the best choice.
+torchao.quantization.gemlite_uintx_weight_only
+    alias of GemliteUIntXWeightOnlyConfig
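
The removed docstring pins down the constraints: fp16 models only, 8-bit symmetric or 4-bit asymmetric weights, gemlite's Triton kernels. A usage sketch with the old signature's defaults (requires the gemlite package and a CUDA GPU):

    import torch
    from torchao.quantization import quantize_, gemlite_uintx_weight_only

    # gemlite only supports fp16 weights, per the removed docstring.
    model = torch.nn.Linear(4096, 4096, bias=False).half().cuda()
    quantize_(model, gemlite_uintx_weight_only(group_size=64, bit_width=4))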
18 changes: 3 additions & 15 deletions main/generated/torchao.quantization.uintx_weight_only.html
@@ -415,22 +415,10 @@

 uintx_weight_only
-torchao.quantization.uintx_weight_only(dtype, group_size=64, pack_dim=-1, use_hqq=False)  [source]
-    Applies uintx weight-only asymmetric per-group quantization to linear layers, using uintx
-    quantization where x is the number of bits specified by `dtype`
-
-    Parameters:
-        dtype – torch.uint1 to torch.uint7 sub byte dtypes
-        group_size – parameter for quantization, controls the granularity of quantization,
-            smaller size is more fine grained, defaults to 64
-        pack_dim – the dimension we use for packing, defaults to -1
-        use_hqq – whether to use hqq algorithm or the default algorithm to quantize the weight
+torchao.quantization.uintx_weight_only
+    alias of UIntXWeightOnlyConfig
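
A usage sketch matching the removed signature; `dtype` picks the bit width (torch.uint1 through torch.uint7) and use_hqq opts into HQQ-based quantization:

    import torch
    from torchao.quantization import quantize_, uintx_weight_only

    model = torch.nn.Linear(1024, 1024)
    # 4-bit asymmetric per-group weight-only quantization, 64 weights per group.
    quantize_(model, uintx_weight_only(torch.uint4, group_size=64, use_hqq=False))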
6 changes: 3 additions & 3 deletions main/genindex.html
@@ -514,7 +514,7 @@ F

-fpx_weight_only() (in module torchao.quantization)
+fpx_weight_only (in module torchao.quantization)
 from_hp_to_floatx() (torchao.dtypes.AffineQuantizedTensor class method)

@@ -536,7 +536,7 @@

 G
-gemlite_uintx_weight_only() (in module torchao.quantization)
+gemlite_uintx_weight_only (in module torchao.quantization)

@@ -702,7 +702,7 @@ T

 U
-uintx_weight_only() (in module torchao.quantization)
+uintx_weight_only (in module torchao.quantization)
Binary file modified main/objects.inv
2 changes: 1 addition & 1 deletion main/searchindex.js

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions main/tutorials/template_tutorial.html
@@ -443,11 +443,11 @@ Steps

 print(x)

-tensor([[0.3420, 0.9864, 0.6182],
-        [0.9509, 0.6658, 0.6808],
-        [0.7630, 0.3657, 0.5552],
-        [0.8138, 0.8566, 0.2858],
-        [0.1471, 0.9493, 0.6773]])
+tensor([[0.5739, 0.8610, 0.0928],
+        [0.2888, 0.6691, 0.5624],
+        [0.5564, 0.6546, 0.8487],
+        [0.6128, 0.9871, 0.9530],
+        [0.2163, 0.6581, 0.6389]])
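
Both hunks in this commit that touch the template tutorial change only these tensor values: the output comes from an unseeded torch.rand call, so every docs rebuild regenerates it. The underlying step, with a hedged note on pinning it:

    import torch

    x = torch.rand(5, 3)  # unseeded: values differ on every docs build, as this diff shows
    print(x)

    # Seeding first would make rebuilt docs stable (a suggestion, not what the tutorial does):
    torch.manual_seed(0)
    print(torch.rand(5, 3))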
