1. Preface
How do you increase the maximum PCIe link speed on a Jetson Xavier?
The link is limited to 2.5 GT/s, yet the Xavier hardware appears capable of 8 GT/s.
This is with JetPack 4.5:
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1) (prog-if 00 [Normal decode])
    LnkCap: Port #0, Speed 8GT/s, Width x1, ASPM not supported, Exit Latency L0s <1us, L1 <64us
    LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
Here is the full lspci -vv output for the same root port when no device is connected to the NX:
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 33
    Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
    I/O behind bridge: 00001000-00001fff
    Memory behind bridge: 40000000-400fffff
    Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
    BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Address: 0000000000000000  Data: 0000
        Masking: 00000000  Pending: 00000000
    Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0
            ExtTag- RBE+
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 8GT/s, Width x1, ASPM not supported, Exit Latency L0s <1us, L1 <64us
            ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
        RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
        RootCap: CRSVisible+
        RootSta: PME ReqID 0000, PMEStatus- PMEPending-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
        Vector table: BAR=2 offset=00000000
        PBA: BAR=2 offset=00010000
    Capabilities: [100 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [148 v1] #19
    Capabilities: [168 v1] #26
    Capabilities: [18c v1] #27
    Capabilities: [1ac v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
            PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
            T_CommonMode=60us
        L1SubCtl2: T_PwrOn=60us
    Capabilities: [1bc v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
    Capabilities: [2bc v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
    Capabilities: [2f4 v1] #25
    Capabilities: [300 v1] Precision Time Measurement
        PTMCap: Requester:+ Responder:+ Root:+
        PTMClockGranularity: 16ns
        PTMControl: Enabled:- RootSelected:-
        PTMEffectiveGranularity: Unknown
    Capabilities: [30c v1] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
    Kernel driver in use: pcieport
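You do not need the whole dump to watch the link; filtering the lspci output to just the link fields works fine (same root port address as above):

$ sudo lspci -s 0004:00:00.0 -vv | grep -E 'LnkCap:|LnkSta:'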
2. Consulting the documentation
The Jetson Xavier root ports actually support Gen4 speed (i.e. 16 GT/s), and this is the default setting: when a Gen4-capable device is connected, the link comes up at Gen4 speed. Otherwise, the link speed depends on what is connected to the root port, so the final speed is determined by the device on the other end.
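For reference, the speed values that appear in the PCIe link registers (and in the script below) are the standard 4-bit speed codes. Here is a small helper sketch of my own, not from any NVIDIA documentation, showing the mapping:

# Sketch: decode the 4-bit link speed code found in the PCIe
# Link Status and Link Control 2 registers.
decode_speed() {
    case "$1" in
        1) echo "2.5 GT/s (Gen1)" ;;
        2) echo "5 GT/s (Gen2)" ;;
        3) echo "8 GT/s (Gen3)" ;;
        4) echo "16 GT/s (Gen4)" ;;
        *) echo "unknown code: $1" ;;
    esac
}

decode_speed 3   # prints: 8 GT/s (Gen3)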
You can change the speed at runtime with the following script, pcie_set_speed.sh:
#!/bin/bash
# pcie_set_speed.sh <device> [target speed code]
dev=$1
speed=$2

if [ -z "$dev" ]; then
    echo "Error: no device specified"
    exit 1
fi

# Allow a short address like 01:00.0 by prepending the default domain
if [ ! -e "/sys/bus/pci/devices/$dev" ]; then
    dev="0000:$dev"
fi

if [ ! -e "/sys/bus/pci/devices/$dev" ]; then
    echo "Error: device $dev not found"
    exit 1
fi

# PCI Express Capabilities register; bits 7:4 hold the device/port type
pciec=$(setpci -s $dev CAP_EXP+02.W)
pt=$((("0x$pciec" & 0xF0) >> 4))

port=$(basename $(dirname $(readlink "/sys/bus/pci/devices/$dev")))

# For endpoints (0), legacy endpoints (1) and switch upstream ports (5),
# the link speed is controlled from the port above, so operate on that
if (($pt == 0)) || (($pt == 1)) || (($pt == 5)); then
    dev=$port
fi

lc=$(setpci -s $dev CAP_EXP+0c.L)   # Link Capabilities
ls=$(setpci -s $dev CAP_EXP+12.W)   # Link Status

max_speed=$(("0x$lc" & 0xF))

echo "Link capabilities:" $lc
echo "Max link speed:" $max_speed
echo "Link status:" $ls
echo "Current link speed:" $(("0x$ls" & 0xF))

# Default to the maximum supported speed, and clamp the request to it
if [ -z "$speed" ]; then
    speed=$max_speed
fi

if (($speed > $max_speed)); then
    speed=$max_speed
fi

echo "Configuring $dev..."

# Link Control 2: bits 3:0 hold the target link speed
lc2=$(setpci -s $dev CAP_EXP+30.L)

echo "Original link control 2:" $lc2
echo "Original link target speed:" $(("0x$lc2" & 0xF))

lc2n=$(printf "%08x" $((("0x$lc2" & 0xFFFFFFF0) | $speed)))
echo "New target link speed:" $speed
echo "New link control 2:" $lc2n

setpci -s $dev CAP_EXP+30.L=$lc2n

echo "Triggering link retraining..."

# Link Control: bit 5 (0x20) is the Retrain Link bit
lc=$(setpci -s $dev CAP_EXP+10.L)
echo "Original link control:" $lc

lcn=$(printf "%08x" $(("0x$lc" | 0x20)))
echo "New link control:" $lcn

setpci -s $dev CAP_EXP+10.L=$lcn

sleep 0.1

ls=$(setpci -s $dev CAP_EXP+12.W)
echo "Link status:" $ls
echo "Current link speed:" $(("0x$ls" & 0xF))
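For example, to request 8 GT/s (speed code 3) on a device, pass its bus address and the target speed (the address here is only an example; substitute your own):

$ sudo ./pcie_set_speed.sh 0005:01:00.0 3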
Is there a more permanent way to change the PCIe speed, rather than executing this script every time?
3. Installing an 8 GT/s device
Without running any script and with nothing connected, there is no partner to negotiate with, so the link speed stays at 2.5 GT/s. If you install an 8 GT/s device, you will see the link speed adjust accordingly.
Here is an excerpt for an NVMe device, running at 8 GT/s x4:
0005:01:00.0 Non-Volatile memory controller: Micron/Crucial Technology Device 540a (rev 01) (prog-if 02 [NVM Express])
    Subsystem: Micron/Crucial Technology Device 540a
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 35
    IOMMU group: 61
    Region 0: Memory at 1f40000000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [80] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
        LnkCap: Port #1, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 unlimited
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 8GT/s (ok), Width x4 (ok)
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
And this is the bridge it is connected to:
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 35
    IOMMU group: 60
    Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
    I/O behind bridge: 0000f000-00000fff [disabled]
    Memory behind bridge: 40000000-400fffff [size=1M]
    Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [disabled]
    Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
    BridgeCtl: Parity- SERR- NoISA- VGA- VGA16- MAbort- >Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Address: 0000000000000000  Data: 0000
        Masking: 00000000  Pending: 00000000
    Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0
            ExtTag- RBE+
        DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM not supported
            ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
        LnkSta: Speed 8GT/s (downgraded), Width x4 (downgraded)
            TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt+
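As a convenience, sysfs can report every link at once on kernels that expose the current_link_speed and max_link_speed attributes (they may be absent on the 4.9 kernel shipped with JetPack 4.5, so treat this as a sketch):

for d in /sys/bus/pci/devices/*; do
    # Skip devices that do not expose the link speed attributes
    [ -r "$d/current_link_speed" ] || continue
    echo "$(basename "$d"): $(cat "$d/current_link_speed") (max: $(cat "$d/max_link_speed"))"
done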
4. Debugging the mining program
When the mining program starts, it checks the available memory, which appears to depend on the PCIe link. With nothing changed, you cannot mine anything on the Xavier NX, and the log reads:
cuda-0 Using Pci Id : 00:00.0 Xavier (Compute 7.2) Memory : 2.5 GB
The process requires at least 4.2 GB to generate the DAG.
If you change the PCIe speed and then run the mining process, you get the following message:
cuda-0 Using Pci Id : 00:00.0 Xavier (Compute 7.2) Memory : 6.19 GB
This time the mining process ran successfully because it had enough memory to generate the DAG. So, one way or another, the reported memory is tied to the PCIe link speed, and raising it makes it possible to run the mining process on this card.
5. Adjusting the device tree
There is a device tree property named "nvidia,init-speed". You can try adding it to the PCIe controller nodes with a device tree overlay:
pcie@14160000 {
    nvidia,init-speed = <3>;
};

pcie@141a0000 {
    nvidia,init-speed = <4>;
};
The method just described involves building a new DTB that the kernel loads at boot time.
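One possible outline of that route, assuming dtc is installed (the file names here are placeholders, and flashing the resulting DTB is board-specific, so treat this as a sketch rather than the exact procedure):

# Decompile the running device tree to source form
dtc -I fs -O dts -o current.dts /proc/device-tree
# Edit current.dts to add the nvidia,init-speed properties shown above,
# then recompile into a binary blob to be loaded at boot
dtc -I dts -O dtb -o modified.dtb current.dts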
The easier way is to run the pcie_set_speed.sh script automatically at startup. This can be done with a systemd service. Save the following unit file as /etc/systemd/system/pcie_set_speed.service:
[Unit]
Description=Set PCIe Speed

[Service]
Type=oneshot
ExecStart=/root/pcie_set_speed.sh

[Install]
WantedBy=sysinit.target
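Note that pcie_set_speed.sh expects a device argument (and optionally a target speed code), so in practice the ExecStart line needs them as well; for example (the address and speed are placeholders):

ExecStart=/root/pcie_set_speed.sh 0005:01:00.0 3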
Then copy the pcie_set_speed.sh script to /root/ and make sure it is executable. Now run:
$ sudo systemctl daemon-reload
$ sudo systemctl enable pcie_set_speed
$ sudo systemctl start pcie_set_speed
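To confirm the service ran and to see the script's output, the standard systemd tools work:

$ systemctl status pcie_set_speed
$ journalctl -u pcie_set_speed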
With that, the configuration is done.