Skip to content

Conversation

pull[bot]
Copy link

@pull pull bot commented Dec 12, 2021

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot added the ⤵️ pull label Dec 12, 2021
cfib and others added 29 commits August 21, 2024 12:36
* Gowin. Implement the UserFlash primitive

Some Gowin chips have embedded flash memory accessible from the fabric.
Here we add primitives that allow access to this memory.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>

* Gowin. Fix cell creation

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>

---------

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Some Clocks PIPS were not created due to a check for the presence of a
delay class, now all wires are attributed to the class so that there is
no longer any need for this check.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Add DHCEN primitive.

This primitive allows you to dynamically turn off and turn on the
networks of high-speed clocks.

This is done tracking the routes to the sinks and if the route passes
through a special HCLK MUX (this may be the input MUX or the output MUX,
as well as the interbank MUX), then the control signal of this MUX is
used.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>

* Gowin. Change the DHCEN binding

Use the entire PIP instead of a wire - avoids normalisation and may also
be useful in the future when calculating clock stuff.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>

---------

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Implement the EMCU primitive.

Add support for the GW1NSR-4C's embedded Cortex-M3 processor. Since it
uses flash in its own way, we disable additional flash processing for
this case.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>

* Gowin. Fix merge.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>

---------

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Fix truncation of output seed value from 64 bits to 32 bits (int
  instead of uint64) when written to json file.

* Fix input seed value conversion when --seed option is used.

* Remove input seed value scrambling (use of rngseed()) when --seed
  or --randomize-seed option is used since the output seed value will
  be the scrambled value and not the seed that was actually supplied
  or generated.
Signed-off-by: gatecat <gatecat@ds0.me>
timing: Add clock2clock delay as seperate
        timing line item.
timing_log: Use common segment type strings
timing: Fix max_delay_by_domain_pair function
timing: Fix hold time check
timing: Disable clock_skew analysis by default
Ravenslofty and others added 30 commits August 8, 2025 18:19
The GW-5A series has 8 flip-flops in a cell instead of 6. These
additional flip-flops can be used if the control network matches that
for the 4th and 5th DFFs in this cell.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
With the release of Apicula 0.22, the GW5A series gained support for
simple IO, LUTs (including Widw LUTs), and DFFs (including flip-flops 6
and 7 specific to the GW5A series), so we can include the GW5A-25A among
Gowin devices.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Timing

* clangformat

* Import some new data

* Import all timing data

* Add constants for needed timings

* Add separate file for delay handling

* wip

* Added helpers

* wip

* proper place for assignArchInfo

* wip

* wip

* Fixes for IO

* Add IOSEL delays

* Fix logic loops

* help figure out some ram paths

* return true only if exists

* cover all primitives

* Disable not used paths

* clockToQ

* Added some RAM timings

* Add more IOPATHs

* cleanup

* cleanup

* Map few more timings

* remove short name options

* support strings as options

* no need for return
The ALUs in the GW5A series have undergone changes compared to previous
chips.

The most significant change is the appearance of an input MUX for
carry — it is now possible to switch between VCC, GND, and COUT of the
previous ALU, as well as generate carry in logic.

The granularity of resource allocation for ALUs has also changed — it is
now possible to use each half of a slice independently for ALUs.

Not all new features are reflected in this commit:

  - since there is one CIN MUX for every six ALUs and it only works for
    ALUs with index 0, the new granularity is not very useful: the head of
    the chain can only be placed in the zero ALU. It is possible to gain one
    LUT by allocating ALUs in odd numbers, but we will leave that for the
    future.

  - using CIN MUX to generate carry in logic is interesting, but we have
    not yet been able to get the vendor IDE to generate such a
    configuration to figure out which wires are used, so for now we are
    leaving the old behavior in logic with the allocation of a specialized
    head ALU.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
The LUTRAM mode is added to all supported chips at once.

This is essentially an alias for LUT4, so the packaging is also moved
before searching for LUT-DFF pairs for possible optimization.

In addition to being the only LUTRAM mode in the GW5A series, the
addition of ROM16 eliminates the need to manually rename the primitive
and its pins when working with files generated by Gowin IDE - a similar
situation occurred with INV, which is essentially LUT1.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Optimize ALU wiring

Interestingly, although VCC and GND sources are present in each cell,
they cannot be connected directly to all LUT inputs. Instead, additional
PIPs are used.

A very simple ALU optimization: once we detect that one of the inputs is
a constant, we modify the main LUT that describes the ALU function so
that this primitive input is ignored, and then disconnect it from the
network, freeing up the PIP.
For example (unrealistic, since a real ALU LUT has a larger size and
service bits in the middle, etc.), the addition function of A and B when
A = 1 is converted from the general case (A isn't a constant and B isn't a
constant) to a special case:
0110 -> 0011

The renaming of ALU ports for ADD and SUB modes has also been
removed—this has already been done in the chip database as a fixed
change to the ALU LUT.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>

* Gowin. Fix the style.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>

---------

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
A programmable on-chip crystal oscillator has been implemented for the
GW5A series.

A critical innovation in this series was the change in the nature of the
OSC output pin—it now belongs to the clock wires, and therefore the
routes must be made with a special global router, as there is no
possibility of using routing through general-purpose PIPs.

At the same time, we are transferring the outputs of all previous
generations of OSC to potential clock wires. At the moment, this will
not affect the way they are routed - they will still end up as segments
as before, but in the future we may optimize the mechanism.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Very rarely (about once a year), the dedicated clock router would
malfunction, issuing an incorrect route.

The reason turned out to be the so-called gate wires to the global clock
wire system from the logic. Among the PIPs for which these wires are
sinks, there are PIPs where the sources are also clock wires.

This leads to the possibility of feeding the clock signal back into the
gate and again into the global clock MUX.

If handled carelessly, this can lead to a complete loop.

But the loop option itself is particularly useful in the case of DCS
(dynamic clock selection) - the fact is that because these primitives
have four clock inputs and each of them could theoretically address all
56 clock sources, but in practice there are not enough wires and the DCS
inputs cannot serve as sinks for all clock sources.

The simplest solution (and the one that currently works) is to use the
gate to re-enter the clock system, but this time changing the clock
source.

This commit explicitly marks wires as gates and removes the possibility
of looping (however unlikely it may be) where a loop is not needed.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* gatemate: Split BRAMs into halfs

* Cleanups

* move code arround

* optmize remapping halfs

* Name RAM cells

* fix cluster setting for cascade mode

* attach ECC pins

* rewire global clocks

* bump chip database version

* Fix KEEPER setting

* Fix conflict check

* cleanup
* Add bridge support

* Use bridge only if CPE is unused

* do not use CPE_MULT for MUX routing

* Fixed and documented

* delay for CPE_BRIDGE

* Convert bridge pips into bels

Co-authored-by: Miodrag Milanovic <mmicko@gmail.com>

* recursively reassign bridges

* reconnect cell ports to new nets

* handle inversion bits

* sort data in output for easier compare

* one to be removed after testing

* debug message

* Remove need for notifyPipChange

* use same logic for detecting bridge pips

* make sure that the pip used is the one assigned

* one wire may feed multiple ports

* remove #if

* clean up wire binding

* add debugging

* fix

* clangformat

* put back to error

* use tile instead of getting name out of bel/pip

* bump chipdb

* adressing review comments

* Addressed last one

---------

Co-authored-by: Lofty <dan.ravensloft@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.