US Pat. No. 9,397,938

PACKET SCHEDULING IN A NETWORK PROCESSOR

Cavium, Inc., San Jose, ...

1. A circuit for managing transmittal of packets, the circuit comprising:
a packet descriptor manager (PDM) circuit module configured to generate a metapacket from a command signal, the metapacket
indicating a size and a destination of a packet to be transmitted by the circuit;

a packet scheduling engine (PSE) circuit module configured to model transmission of the packet through a model of a network
topology from the destination to the circuit, the model of the network topology including simulated instances of a plurality
of nodes in the network topology between the destination and the circuit and connections between the plurality of nodes, the
modeling being based on information indicated by the metapacket, the PSE determining an order in which to transmit the packet
among a plurality of packets based on the model transmission; and

a packet engines and buffering (PEB) circuit module configured to process the packet and cause the processed packet to be
transmitted toward the destination according to the order determined by the PSE.

US Pat. No. 9,509,362

METHOD AND APPARATUS FOR HANDLING MODIFIED CONSTELLATION MAPPING USING A SOFT DEMAPPER

Cavium, Inc., San Jose, ...

1. A method for processing data from a network, comprising:
obtaining, from a storage matrix, a first symbol received from a wireless communications network;
identifying the first symbol as being one of predefined encoding categories associated with control signals for managing data
transfer over the wireless communications network;

forcing at least one input value to infinity to discard unused constellation points associated with the first symbol in accordance
with a mapping rule;

performing a first minimum function based on the first symbol and the input value; and
generating a Log Likelihood Ratio (“LLR”) value representing a first logic value in response to output of the first minimum
function.
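
The claimed demapping steps (discard unused constellation points by forcing their inputs to infinity, take minima, form an LLR) correspond to the standard max-log LLR approximation. A minimal sketch, with all names and the squared-distance metric assumed for illustration:

```python
import math

def max_log_llr(distances, bit_of_point, unused_points):
    """Max-log LLR for one bit position.

    distances: squared distance from the received symbol to each
    constellation point; points in `unused_points` are discarded by
    forcing their value to infinity, as in the claim.
    bit_of_point: the bit (0 or 1) each constellation point maps to.
    """
    d = [math.inf if i in unused_points else dist
         for i, dist in enumerate(distances)]
    # Minimum over points carrying bit 0, then over points carrying bit 1.
    m0 = min(di for di, b in zip(d, bit_of_point) if b == 0)
    m1 = min(di for di, b in zip(d, bit_of_point) if b == 1)
    return m1 - m0   # positive value favours bit 0 in this convention
```

The infinity-forcing step means a pruned constellation point can never win either minimum, which is exactly how unused points are ignored without restructuring the mapping table.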

US Pat. No. 9,729,527

LOOKUP FRONT END PACKET INPUT PROCESSOR

Cavium, Inc., San Jose, ...

1. An apparatus comprising:
a memory storing a Rule Compiled Data Structure (RCDS), the RCDS representing a set of rules for packet classification, the
RCDS including a decision tree for selecting a subset of the set of rules for filtering a given packet, the decision tree
including a root node and a plurality of leaf nodes, each of the plurality of leaf nodes indicating one of the subset of the
set of rules, the root node indicating a starting address in the memory for the set of rules;

a host command interface, the host command interface configured to receive one or more host commands for an incremental update
for the RCDS;

a processor coupled to the memory and the host command interface, the processor configured to:
a) receive a key extracted from the packet, the key indicating which of the set of rules are applicable to the packet, and
b) perform an active search of the RCDS to: 1) apply the key to the decision tree to locate a matching leaf node out of the
plurality of leaf nodes, the matching leaf node indicating the subset of the set of rules, and 2) classify the given packet
based on the subset of the set of rules and independent of rules excluded from the subset of rules, the RCDS being updated
based on the one or more host commands received, the RCDS being updated after the active search is completed.
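
The walk the claim describes (apply the key to a decision tree, land on a leaf naming a rule subset, classify against only that subset) can be sketched with a toy tree; the node layout here is illustrative, not the on-chip RCDS format:

```python
def classify(root, key, rules):
    """Walk a toy decision tree with a key extracted from a packet.

    Interior nodes are (bit_index, left, right) tuples; leaves are lists
    of rule indices (the subset applicable to this key). `rules` holds
    predicates standing in for compiled rules.
    """
    node = root
    while isinstance(node, tuple):            # interior node: branch on a key bit
        bit_index, left, right = node
        node = right if (key >> bit_index) & 1 else left
    # node is now a leaf: test only the subset it names, nothing else
    return [i for i in node if rules[i](key)]
```

Rules outside the matched leaf are never evaluated, which is the claim's "independent of rules excluded from the subset" property.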

US Pat. No. 9,563,399

GENERATING A NON-DETERMINISTIC FINITE AUTOMATA (NFA) GRAPH FOR REGULAR EXPRESSION PATTERNS WITH ADVANCED FEATURES

Cavium, Inc., San Jose, ...

1. A method of compiling a pattern into a non-deterministic finite automata (NFA) graph, the method comprising:
examining the pattern for a plurality of elements and a plurality of node types, each node type corresponding with an element,
each element of the pattern to be matched at least zero times, the element representing a character, character class or string;

generating a plurality of nodes of the NFA graph, each node of the plurality of nodes configured to match with one of the
plurality of elements and store the node type corresponding to the element, a next node address in the NFA graph, a count
value, and the element, wherein the next node address and the count value are applicable as a function of the node type stored
and wherein the plurality of nodes generated enable a graph walk engine to identify the pattern in a payload with less nodes
relative to another NFA graph representing the pattern and employed by the graph walk engine to identify the pattern in the
payload.
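
The node format in the claim (element, node type, next address, count) allows one counted node to replace a chain of single-character nodes, which is where the smaller graph comes from. A sketch of consuming input at one such node, with a dict standing in for the stored node fields:

```python
def walk_count_node(node, payload, pos):
    """Consume a counted element at one NFA node.

    node: dict with 'element' (character class as a set), 'count'
    (repetitions to match) and 'next' (next node address).
    Returns (next_address, new_position) on a match, or None.
    """
    for _ in range(node['count']):
        if pos >= len(payload) or payload[pos] not in node['element']:
            return None
        pos += 1
    return node['next'], pos
```

One node with count=3 here does the work of three chained nodes in a graph without count values.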

US Pat. No. 9,152,494

METHOD AND APPARATUS FOR DATA PACKET INTEGRITY CHECKING IN A PROCESSOR

Cavium, Inc., San Jose, ...

1. A method of handling data packets within a processor, the method comprising:
intercepting, by a hardware packet integrity checking module, one or more data fields associated with a current segment of
a data packet being forwarded from a first hardware entity operating in a cut-through mode to one or more processing clusters,
at least one data field of the one or more data fields being indicative of an operation associated with the data;

checking, at the hardware packet integrity checking module, an integrity of the current segment of the data packet based on
the one or more data fields and parameters corresponding to the operation associated with the data packet to detect an integrity
error;

modifying at least one data field of the one or more data fields upon detecting the integrity error; and
forwarding the one or more data fields to the one or more processing clusters.

US Pat. No. 9,471,416

PARTITIONED ERROR CODE COMPUTATION

Cavium, Inc., San Jose, ...

1. A circuit for calculating an error code of a data message, comprising:
a first computation unit configured to receive a first word and generate a corresponding first error code;
a second computation unit configured to receive a second word and generate a corresponding second error code; and
a control circuit configured to 1) selectively accumulate the first and second error codes based on at least one control signal
to provide an accumulated error code, 2) selectively loop back the accumulated error code to an input of the control circuit
for further accumulation with a subsequent error code generated by at least one of the first and second computation units
to provide a final accumulated error code based on the at least one control signal, and 3) selectively output at least one
of the first error code, the second error code and the final accumulated error code based on the at least one control signal.
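
The structure of the claim (two per-word computation units, a control path that accumulates their outputs and loops the running code back for further accumulation) can be sketched with byte-wise XOR parity as a stand-in code; a real design would accumulate CRC partial remainders instead:

```python
def word_code(word):
    # Stand-in per-word error code: byte-wise XOR parity.
    code = 0
    for b in word:
        code ^= b
    return code

def partitioned_code(words):
    """Accumulate per-word codes as the claimed control circuit does,
    looping the accumulated code back for each pair of words."""
    acc = 0
    for i in range(0, len(words), 2):     # two computation units per pass
        acc ^= word_code(words[i])
        if i + 1 < len(words):
            acc ^= word_code(words[i + 1])
    return acc
```

The point of the partitioning is that both units run in parallel each cycle while the control circuit decides whether to accumulate, loop back, or emit.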

US Pat. No. 9,390,023

METHOD AND APPARATUS FOR CONDITIONAL STORING OF DATA USING A COMPARE-AND-SWAP BASED APPROACH

Cavium, Inc., San Jose, ...

1. A method of conditionally storing data, the method comprising:
initiating, by a core processor, an atomic sequence by executing a load operation designed to initiate the atomic sequence,
executing the load operation designed to initiate the atomic sequence includes loading content of a memory location and maintaining
an address indication of the memory location and a copy of the corresponding content loaded; and

performing a conditional storing operation, the conditional storing operation includes a compare-and-swap operation, the compare-and-swap
operation executed by a controller associated with a cache memory based on the address indication of the memory location and
the copy of the corresponding content maintained;

wherein the loaded content is loaded in a first cache memory and the cache memory with which the controller is associated
is a second memory cache, the method further comprising providing, by the core processor, the address indication of the memory
location and the copy of the corresponding content to the second cache memory.
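
The claimed sequence is a load-linked/store-conditional pattern built on compare-and-swap: the load records the address and a copy of the content, and the cache controller later CASes against that copy. A minimal sketch with an in-memory dict standing in for the cache hierarchy:

```python
class Memory:
    """Toy memory whose controller offers compare-and-swap."""
    def __init__(self):
        self.cells = {}

    def compare_and_swap(self, addr, expected, new):
        # Store succeeds only if the cell still holds the expected copy.
        if self.cells.get(addr) == expected:
            self.cells[addr] = new
            return True
        return False

def conditional_store(mem, addr, update):
    """Atomic sequence: load (remembering address and copy), then let
    the controller CAS the updated value against that copy."""
    loaded = mem.cells.get(addr)      # the load that initiates the sequence
    return mem.compare_and_swap(addr, loaded, update(loaded))
```

If another writer changes the location between the load and the CAS, the comparison fails and the conditional store reports failure rather than clobbering the intervening write.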

US Pat. No. 9,059,836

WORD BOUNDARY LOCK

Cavium, Inc., San Jose, ...

1. A method comprising:
initializing an N bit register with initial content, where N is an integer greater than 1;
receiving a number of consecutive N bit words of an incoming data stream;
processing each of the number of consecutive N bit words by performing operations per bit position of the register including
performing a first logic operation on a corresponding received data bit and a next received data bit, performing a second
logic operation on a current state of the bit position of the register and a result of the first logic operation, and storing
a result of the second logic operation to update the state of the bit position of the register; and

defining a word boundary based on the content of the register following the processing of the number of consecutive N bit
words.
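
The two unnamed logic operations can plausibly be XOR (detect a transition between a bit and the next received bit) and AND (keep a register position set only if that transition appears in every word); under that assumption, surviving register bits mark candidate word boundaries:

```python
def word_boundary(bits, n, first_op, second_op):
    """Run the claimed per-position accumulation over a flat bit stream
    treated as consecutive N-bit words. `first_op` combines each bit
    with the next received bit; `second_op` folds the result into the
    register state at that position."""
    reg = [1] * n                              # initialized register content
    for start in range(0, len(bits) - n, n):   # each word (plus one next bit)
        for pos in range(n):
            r = first_op(bits[start + pos], bits[start + pos + 1])
            reg[pos] = second_op(reg[pos], r)
    return reg
```

With a repeating pattern, only positions where a bit transition occurs in every word stay set, so the register content singles out the alignment.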

US Pat. No. 9,313,029

VIRTUALIZED NETWORK INTERFACE FOR REMOTE DIRECT MEMORY ACCESS OVER CONVERGED ETHERNET

CAVIUM, INC., San Jose, ...

6. An apparatus for generating an opaque data comprising a stream identifier, which identifies memory region and access controls
permitted to be accessed by data fields of a packet containing the stream identifier, the packet being formatted in accordance
with remote direct memory access over converged Ethernet, comprising:
an entity configured
to encrypt at least a part of the stream identifier with a first secret random data to provide an encrypted stream identifier,
to generate a digest by applying a cryptographic hash to the at least the part of the stream identifier, and
to combine the encrypted stream identifier with the digest to generate the opaque data, wherein the opaque data comprises
a remote key (R-Key) in accordance with specification implementing Infiniband specification.
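
The combine-encrypt-and-digest construction can be sketched as follows; the XOR keystream stands in for whatever cipher the entity uses, and the 4-byte digest truncation is an arbitrary choice for illustration:

```python
import hashlib

def make_opaque(stream_id: bytes, secret: bytes) -> bytes:
    """Build opaque data from a stream identifier: encrypt it with a
    secret (XOR as a placeholder cipher), append a truncated SHA-256
    digest of the identifier, and return the combination."""
    enc = bytes(a ^ b for a, b in zip(stream_id, secret))
    digest = hashlib.sha256(stream_id).digest()[:4]   # cryptographic digest
    return enc + digest
```

The receiver holding the secret can decrypt the identifier and recompute the digest, so a forged or tampered opaque value fails the digest check.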

US Pat. No. 9,054,729

SYSTEM AND METHOD OF COMPRESSION AND DECOMPRESSION

Cavium, Inc., San Jose, ...

1. A decompression engine comprising:
an ingress port configured to receive an input data stream and a decoder configuration parameter; and
a decoder configured to decode the data stream according to one of at least two Lempel Ziv based protocols based upon the
decoder configuration parameter.

US Pat. No. 9,391,892

METHOD AND APPARATUS FOR MANAGING TRANSPORT OPERATIONS TO A CLUSTER WITHIN A PROCESSOR

Cavium, Inc., San Jose, ...

1. A method of managing transport operations between a first memory cluster and one or more other memory clusters, the method
comprising:
receiving, in the first memory cluster, information related to one or more transport operations with related data buffered
in an interface device, the received information including information indicative of one or more types of the one or more
transport operations, the interface device coupling the first memory cluster to the one or more other memory clusters;

selecting, at an arbitrator having a plurality of first selectors and a second selector, at least one transport operation,
from the one or more transport operations, based at least in part on the received information by selecting a transport operation
among each type using the plurality of first selectors and selecting a transport operation among the transport operations
of different types provided by the plurality of first selectors, using the second selector; and

executing the selected at least one transport operation.
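
The two-level selection in the claim (first selectors pick one operation per type, a second selector picks among those winners) reduces to a small arbitration routine; representing operations as (type, priority) pairs with lower value winning is an assumption of this sketch:

```python
def arbitrate(ops):
    """Two-level arbitration over (op_type, priority) pairs."""
    winners = {}
    for op in ops:                     # first selectors: best op of each type
        t, prio = op
        if t not in winners or prio < winners[t][1]:
            winners[t] = op
    # second selector: choose among the per-type winners
    return min(winners.values(), key=lambda op: op[1])
```

Splitting the choice this way keeps each selector small even when the number of buffered operations per type is large.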

US Pat. No. 9,323,715

METHOD AND APPARATUS TO REPRESENT A PROCESSOR CONTEXT WITH FEWER BITS

Cavium, Inc., San Jose, ...

1. A method of handling processes within a processor device, the method comprising:
maintaining a translation table mapping uncompressed process context identifiers to corresponding compressed identifiers,
the uncompressed process context identifiers and the corresponding compressed identifiers being associated with address spaces
or corresponding computer processes; and

employing the compressed identifiers to probe one or more structures of the processor device in executing an operation;
wherein executing the operation comprises:
determining whether or not a current compressed identifier is valid;
upon determining that the current compressed identifier is valid, using the valid compressed identifier to probe one or more
structures of the processor device in executing the operation;

upon determining that the current compressed identifier is invalid, searching the translation table, based on an uncompressed
process context identifier associated with the operation, for one other compressed identifier mapped to the uncompressed process
context identifier.
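
The translation-table behavior (probe with a compressed identifier when valid, fall back to a search keyed by the uncompressed identifier otherwise) can be sketched with a dict; table width and the missing eviction policy are simplifications:

```python
class ContextTable:
    """Maps wide process context identifiers to short compressed ones."""
    def __init__(self, bits=4):
        self.limit = 1 << bits          # compressed identifiers are `bits` wide
        self.to_compressed = {}

    def lookup(self, uncompressed):
        # Valid mapping exists: reuse the compressed identifier.
        if uncompressed in self.to_compressed:
            return self.to_compressed[uncompressed]
        # Invalid: allocate a new mapping (eviction not modelled here).
        if len(self.to_compressed) >= self.limit:
            raise RuntimeError('table full; eviction not modelled')
        cid = len(self.to_compressed)
        self.to_compressed[uncompressed] = cid
        return cid
```

The hardware win is that TLBs and similar structures are tagged with the few-bit compressed identifier instead of the full context identifier.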

US Pat. No. 9,490,968

CDR VOTER WITH IMPROVED FREQUENCY OFFSET TOLERANCE

Cavium, Inc., San Jose, ...

1. A circuit comprising:
a voter having one or more voter inputs, the voter generating, for each given voter input, an up vote indicative of a recovered
clock having a negative phase offset relative to the given voter input, or a down vote indicative of the recovered clock having
a positive phase offset relative to the given voter input;

a comparator coupled to the voter, the comparator configured to output a phase adjustment signal and a tie signal based upon
the up and down votes generated;

a shift register;
a multiplexer coupled to the comparator and the shift register, the multiplexer configured to select either the phase adjustment
signal or an output from the shift register as a multiplexer output, based on the tie signal; and

a flip-flop receiving the multiplexer output at a data input of the flip-flop, the flip-flop generating a phase adjustment
output signal, the shift register receiving the phase adjustment output signal directly from the flip-flop at a data input
of the shift register.

US Pat. No. 9,379,992

METHOD AND AN APPARATUS FOR VIRTUALIZATION OF A QUALITY-OF-SERVICE

CAVIUM, INC., San Jose, ...

13. An apparatus for virtualization of a quality of service, comprising:
a parser configured to associate a packet received at an interface with an aura via an aura identifier by evaluating information
of the internal fields of the packet's structure and information external to the packet's structure;

an aura management entity communicatively connected to the parser, the aura management entity being configured to determine
configuration parameters for the aura, comprising a parameter identifying maximum number of buffers that may be allocated
to the aura (AURA_CNT_LIMIT), a parameter identifying number of buffers allocated at a particular time (AURA_CNT), and a parameter
identifying aura level related to the pool of available buffers (AURA_POOL_LEVELS),

determine a pool of buffers for the aura,
to determine the state of the pool resources, the resources comprising a level of buffers available in the pool and a level
of buffers allocated to the aura, and

to determine a quality of service for the packet in accordance with the determined state of the pool and the configuration
parameters for the aura.

US Pat. No. 9,344,366

SYSTEM AND METHOD FOR RULE MATCHING IN A PROCESSOR

Cavium, Inc., San Jose, ...

1. A system comprising:
a format block configured to (a) receive a key including one or more bits from a packet, at least one rule for matching the
key, and rule formatting information, the at least one rule having at least one rule dimension, the at least one rule dimension
including a set of one or more bits from a corresponding rule of the at least one rule, and (b) extract each at least one
rule dimension from the at least one rule;

a plurality of dimension matching engines (DMEs), each DME, of the plurality of DMEs, coupled to the format block and configured
to receive the key and a corresponding formatted dimension, and process the key and the corresponding formatted dimension
for returning a match or nomatch; and

a post processing block configured to analyze the matches or nomatches returned from the plurality of DMEs and return a response
based on the returned matches or nomatches.

US Pat. No. 9,225,643

LOOKUP CLUSTER COMPLEX

Cavium, Inc., San Jose, ...

1. A method comprising:
receiving a request associated with a packet, the request including an identifier (ID);
selecting at least one entry in a table indicated by the ID, the entry providing a starting address associated with a set
of rules;

determining a subset of rules based on at least one field of the request, the subset of rules being a portion of the set of
rules;

applying the at least one field against the subset of rules; and
outputting a response signal indicating whether the at least one field matches at least one rule of the subset of rules.

US Pat. No. 9,203,805

REVERSE NFA GENERATION AND PROCESSING

Cavium, Inc., San Jose, ...

1. A method comprising:
in a processor of a security appliance coupled to a network:
walking an input of a sequence of characters through a deterministic finite automata (DFA) graph generated for at least one
given regular expression pattern to enable inspection of packet content, the at least one given regular expression employed
to detect a security breach or an intrusion; and

at a marked node of the DFA graph, the marked node being a node that marks a match of the at least one given regular expression
pattern:

based on a specific type of the at least one given regular expression pattern matching at the marked node, walking the input
sequence of characters through a reverse non-deterministic finite automata (rNFA) graph by walking the input sequence of characters
backwards through the rNFA graph beginning from an offset of the input sequence of characters associated with the marked node,
the rNFA graph generated for the at least one given regular expression pattern and having at least one processing node inserted
therein, the at least one processing node inserted into the rNFA graph based on the specific type of the at least one regular
expression pattern; and
based on the specific type of the at least one given regular expression pattern not matching at the marked node, reporting
the match of the at least one given regular expression pattern.

US Pat. No. 9,112,767

METHOD AND AN ACCUMULATOR SCOREBOARD FOR OUT-OF-ORDER RULE RESPONSE HANDLING

Cavium, Inc., San Jose, ...

1. A method of managing bundles of rule matching threads processed by one or more rule matching engines in a search processor,
the method comprising:
recording, for each rule matching thread in a given bundle of rule matching threads, a rule matching result in association
with a priority corresponding to the respective rule matching thread;

determining a final rule matching result, for the given bundle of rule matching threads, based at least in part on the priorities
corresponding to the rule matching threads in the given bundle; and

generating a response state indicative of the determined final rule matching result for reporting to a host processor.
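
Resolving a bundle of out-of-order thread results by priority reduces to a small fold; representing each recorded result as a (priority, matched) pair with lower priority value winning is an assumption here:

```python
def final_result(thread_results):
    """Resolve a bundle of rule-match results into one final result:
    the matching thread with the best (lowest) priority wins; None
    means no thread in the bundle matched."""
    matches = [prio for prio, matched in thread_results if matched]
    return min(matches) if matches else None
```

Because results are recorded against priorities rather than arrival order, the engines may complete in any order without changing the final response reported to the host.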

US Pat. No. 9,438,703

METHOD OF FORMING A HASH INPUT FROM PACKET CONTENTS AND AN APPARATUS THEREOF

Cavium, Inc., San Jose, ...

1. A method of implementing a parser engine, the method comprising:
identifying one or more protocol layers of a packet, wherein each of the protocol layers have one or more fields;
for each protocol layer of the protocol layers, expanding the protocol layer to a generic format based on the identification
of the protocol layer thereby forming an expanded protocol layer;

selecting one or more of a set of generic hash commands based on a layer type of the expanded protocol layer; and
selecting contents from each of the expanded protocol layers to apply a hash function by applying the one or more of the set
of generic hash commands to the expanded protocol layer.

US Pat. No. 9,398,033

REGULAR EXPRESSION PROCESSING AUTOMATON

Cavium, Inc., San Jose, ...

1. A method comprising:
in a processor of a security appliance coupled to a network:
generating an initial NFA for a set of patterns;
generating an initial DFA for the set of patterns, the initial DFA generated having states representing a subset of the set
of patterns and having at least one end-state, where each end-state maps to one or more states of the initial NFA for the
set of patterns and represents a transition from processing the set of patterns as DFA to processing the set of patterns as
NFA;

adding states to the initial DFA, extending from the at least one end-state, to form an extended DFA for the set of patterns;
mapping at least one of the added DFA states to one or more states of the NFA to form an end-state of the extended DFA, which
when processed, transitions run time processing for finding the set of patterns in an input stream from DFA to NFA so that
the portion of the set of patterns is found using DFA while a remaining portion of the set of patterns is found using the
NFA; and

in an event at least one pattern of the set of patterns has an overlapping pattern prefix with at least one other pattern
of the set of patterns, the method further comprising:

stripping from the at least one pattern the overlapping pattern prefix to form a stripped portion and stripped pattern based
on at least one stripping criteria and dropping the at least one pattern from the set of patterns to find in the input stream
in an event the at least one pattern in its entirety falls within any one of the at least one stripping criteria; and

generating a reverse NFA for the stripped portion, wherein generating the initial DFA includes generating an initial DFA for
the stripped pattern and the at least one other pattern and adding states to the initial DFA includes adding states to the
initial DFA to form an extended DFA for the stripped pattern and the at least one other pattern.

US Pat. No. 9,330,002

MULTI-CORE INTERCONNECT IN A NETWORK PROCESSOR

Cavium, Inc., San Jose, ...

1. A computer system on a computer chip comprising:
an interconnect circuit;
a plurality of memory buses, each bus connecting a respective group of plural processor cores to the interconnect circuit;
and

a cache divided into a plurality of banks, each bank being connected to the interconnect circuit via an individual bus;
the interconnect circuit configured to distribute a plurality of requests received from the plural processor cores among the
plurality of banks, wherein the interconnect circuit transforms the requests by modifying an address component of the requests.

US Pat. No. 9,413,357

HIERARCHICAL STATISTICALLY MULTIPLEXED COUNTERS AND A METHOD THEREOF

Cavium, Inc., San Jose, ...

1. A counter architecture implemented in a network device, the counter architecture comprising a plurality of levels of statistically
multiplexed counters, wherein each of the levels of statistically multiplexed counters includes N counters arranged in N/P
rows, wherein each of the N/P rows includes P base counters and S subcounters, wherein any of the P base counters configured
to be dynamically concatenated with one or more of the S subcounters to flexibly extend the counting capacity.
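
One row of the claimed architecture (P base counters sharing S subcounters, with a subcounter concatenated on demand when a base counter overflows) can be sketched as below; widths and the allocation policy are illustrative:

```python
class StatMuxRow:
    """One row of P base counters that can borrow from S shared
    subcounters to extend their counting capacity."""
    def __init__(self, p=4, s=2, base_bits=8):
        self.base = [0] * p
        self.sub = [None] * p            # which subcounter, if any, extends each
        self.subcounters = [0] * s
        self.free = list(range(s))
        self.limit = 1 << base_bits

    def increment(self, i):
        self.base[i] += 1
        if self.base[i] == self.limit:          # base counter overflowed
            self.base[i] = 0
            if self.sub[i] is None:
                self.sub[i] = self.free.pop()   # concatenate dynamically
            self.subcounters[self.sub[i]] += 1

    def value(self, i):
        hi = 0 if self.sub[i] is None else self.subcounters[self.sub[i]]
        return hi * self.limit + self.base[i]
```

Statistically, few counters are hot at once, so S can be much smaller than P while still covering the counters that actually overflow.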

US Pat. No. 9,319,316

METHOD AND APPARATUS FOR MANAGING TRANSFER OF TRANSPORT OPERATIONS FROM A CLUSTER IN A PROCESSOR

Cavium, Inc., San Jose, ...

1. A method comprising:
selecting, at a clock cycle in a first memory cluster, at least one transport operation destined to at least one destination
memory cluster, from one or more transport operations, each transport operation comprising transfer of data, related to a
corresponding processing operation, the selecting based at least in part on priority information associated with the one or
more transport operations or current states of available processing resources allocated to the first memory cluster in each
of a subset of one or more other memory clusters;

initiating the transport of the selected at least one transport operation; and
updating, in at least one other memory cluster, a current state of available processing resources allocated to the first memory
cluster, corresponding to the selected at least one transport operation.

US Pat. No. 9,237,581

APPARATUS AND METHOD FOR MEDIA ACCESS CONTROL SCHEDULING WITH A SORT HARDWARE COPROCESSOR

Cavium, Inc., San Jose, ...

1. An apparatus, comprising:
a Media Access Control (MAC) scheduler that generates a sort request;
a hardware based sort coprocessor that services the sort request in accordance with specified packet processing priority parameters
to generate a sorted array, wherein the hardware based sort coprocessor is configured to prioritize traffic based upon a priority
parameter based on quality of service parameters, channel conditions, wait-in-queue time and timing efficiency;

a shared memory separate from and accessible by each of the MAC scheduler and the hardware based sort coprocessor, wherein
the MAC scheduler generates a sort request that is written to the shared memory, wherein the sort request references a sort
list, a number of elements to sort and a maximum number of sorted elements to return; and

the hardware based sort coprocessor accesses the sort request from the shared memory and operates to service the sort request.
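
The shared-memory sort request the claim describes (a sort list, an element count, and a cap on returned elements) has a direct software analogue; the priority key is whatever composite of QoS, channel condition, wait time, and timing efficiency the scheduler encodes, abstracted here as a caller-supplied function:

```python
def service_sort_request(sort_list, count, max_return, key):
    """Serve a sort request as written to shared memory: sort `count`
    elements of `sort_list` by the priority key and return at most
    `max_return` sorted elements."""
    return sorted(sort_list[:count], key=key)[:max_return]
```

Offloading this to a hardware coprocessor frees the MAC scheduler from the O(n log n) work on every scheduling interval.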

US Pat. No. 9,379,963

APPARATUS AND METHOD OF GENERATING LOOKUPS AND MAKING DECISIONS FOR PACKET MODIFYING AND FORWARDING IN A SOFTWARE-DEFINED NETWORK ENGINE

Cavium, Inc., San Jose, ...

1. An engine for generating lookups and making decisions (LDE) for packet modifying and forwarding in a software-defined network
(SDN) system, the LDE comprising:
a Key Generator configured to generate a lookup key for each input token;
an Output Generator configured to generate an output token by modifying the input token based on content of a lookup result
associated with the lookup key; and

a Template Table for identifying positions of fields in each of the input tokens.

US Pat. No. 9,363,193

VIRTUALIZED NETWORK INTERFACE FOR TCP REASSEMBLY BUFFER ALLOCATION

Cavium, Inc., San Jose, ...

1. A method for dynamically allocating context for Transmission Control Protocol (TCP) reassembly, comprising:
providing a fixed plurality of global common TCP contexts;
reserving for each of one or more virtual network interface card(s) one or more TCP context(s) out of the fixed plurality
of the global common TCP contexts, leaving a remainder of the fixed plurality of the global common TCP contexts comprising
one or more contexts; and

allocating to a virtual network interface card from the one or more virtual network interface card(s) a TCP context from the
reserved one or more TCP contexts when a reassemble-able TCP packet is received by the virtual network interface card.
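
A pool with per-vNIC reservations carved out of a fixed global set can be sketched as below; falling back to the shared remainder once a vNIC's reservation is exhausted is an assumption of this sketch, not something the claim spells out:

```python
class TcpContextPool:
    """Fixed global pool of TCP reassembly contexts with per-vNIC
    reservations; the unreserved remainder is shared."""
    def __init__(self, total, reserved):
        self.reserved = dict(reserved)                 # vnic -> reserved count
        self.shared = total - sum(reserved.values())   # the remainder

    def allocate(self, vnic):
        if self.reserved.get(vnic, 0) > 0:
            self.reserved[vnic] -= 1
            return 'reserved'
        if self.shared > 0:                            # assumed fallback
            self.shared -= 1
            return 'shared'
        return None                                    # no context available
```

Reservations guarantee each vNIC a minimum reassembly capacity even when the shared remainder is exhausted by other interfaces.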

US Pat. No. 9,219,560

MULTI-PROTOCOL SERDES PHY APPARATUS

Cavium, Inc., San Jose, ...

1. A multiprotocol interface comprising:
a physical layer transmitter unit configured to transmit data from at least one of one or more synchronous media access control
layer units and one or more asynchronous media access control layer units; and

a physical layer receiver unit configured to receive data and to deliver the received data to at least one of the one or more
synchronous media access control layer units and the one or more asynchronous media access control layer units.

US Pat. No. 9,430,511

MERGING INDEPENDENT WRITES, SEPARATING DEPENDENT AND INDEPENDENT WRITES, AND ERROR ROLL BACK

Cavium, Inc., San Jose, ...

1. A method of updating a memory with a plurality of memory lines, the memory storing a tree, a plurality of buckets, and
a plurality of rules, the method comprising:
maintaining a copy of the memory with a plurality of memory lines;
writing a plurality of changes to at least one of the tree, the plurality of buckets, and the plurality of rules to the copy;
determining whether each of the plurality of changes is an independent write or a dependent write;
merging independent writes to a same memory line of the copy in a single line write; and
transferring updates from the plurality of memory lines of the copy to the plurality of memory lines of the memory.
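
Coalescing independent writes that fall in the same memory line of the shadow copy is a simple grouping step; the line size and flat address space here are illustrative:

```python
def merge_writes(writes, line_size=8):
    """Coalesce independent writes landing in the same memory line into
    one line write. `writes` holds (addr, value) pairs already known to
    be independent of one another."""
    lines = {}
    for addr, value in writes:
        line = addr // line_size
        lines.setdefault(line, {})[addr % line_size] = value
    return lines          # one write-back per touched line
```

Dependent writes cannot be merged this way because their ordering matters; the claim's separate determination step exists precisely to keep them out of the merge.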

US Pat. No. 9,239,753

DRAM ADDRESS PROTECTION

Cavium, Inc., San Jose, ...

1. A memory controller comprising:
a data interface coupled to a memory via a data bus, the data interface configured to write data to an address in the memory;
an address interface coupled to the memory via an address bus, the address interface to provide the address to the memory;
an error code generation module configured to generate an error code based on a function of the data and the address, the
error code being a combination of a check bit code and a gray code; and

an error interface configured to, responsive to a write request for the data, store the error code at a portion of the memory
corresponding to the address.
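
An error code computed over a function of both the data and its address, combining a check bit with a Gray code as the claim states, might look like the following; the widths and the exact combining rule are illustrative assumptions:

```python
def address_protected_code(data: int, addr: int) -> int:
    """Error code over data and address: an even-parity check bit on the
    data combined with a Gray-coded slice of the address."""
    parity = bin(data).count('1') & 1       # check bit over the data
    gray = (addr ^ (addr >> 1)) & 0x7F      # Gray code of low address bits
    return (parity << 7) | gray
```

Folding the address into the code means a read returning correct data from the wrong row still fails the check, which plain data-only ECC cannot detect.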

US Pat. No. 9,065,860

METHOD AND APPARATUS FOR MULTIPLE ACCESS OF PLURAL MEMORY BANKS

Cavium, Inc., San Jose, ...

1. A method of enabling multi-access to a plurality of physical memory banks, the method comprising:
selecting a subset of multiple access requests to be executed in at least one clock cycle over at least one of a number of
access ports connected to the plurality of physical memory banks, the selected subset of access requests addressed to different
physical memory banks, among the plurality of memory banks, each access port coupled to one or more of the plurality of memory
banks, each memory bank accessible by a single access port per clock cycle and each access port accessing a single memory bank;
and

scheduling the selected subset of access requests, each over a separate access port; wherein each access request includes
a memory address with a first set of bits indicative of a physical memory bank, among the plurality of physical memory banks,
and a second set of bits indicative of a memory row within the physical memory bank indicated by the first set of bits.
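
Selecting a conflict-free subset for one clock cycle (at most one request per physical bank) is a greedy filter over the pending requests; taking the bank from the low address bits and serving requests oldest-first are assumptions of this sketch:

```python
def select_requests(requests, bank_bits=2):
    """Pick a conflict-free subset of access requests for one cycle:
    the low address bits name the bank, and at most one request per
    bank is chosen, in arrival order."""
    chosen, busy = [], set()
    for addr in requests:
        bank = addr & ((1 << bank_bits) - 1)   # first set of bits: the bank
        if bank not in busy:
            busy.add(bank)
            chosen.append(addr)
    return chosen
```

Requests left out of the subset simply wait for a later cycle, so the scheduler never issues two accesses to the same bank on one clock.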

US Pat. No. 9,336,328

CONTENT SEARCH MECHANISM THAT USES A DETERMINISTIC FINITE AUTOMATA (DFA) GRAPH, A DFA STATE MACHINE, AND A WALKER PROCESS

Cavium, Inc., San Jose, ...

11. A method comprising:
generating a data structure from one or more regular expressions for a content processing application enabling at least one
processor, configured to process packets in an input stream, to match at least one pattern in the input stream by traversing
the data structure generated with characters from the packets;

including at least one node having an intelligent node structure in the data structure generated, the intelligent node structure
providing information on a next node to traverse and to perform at least one task based on traversing the at least one node,
enabling the at least one processor to generate and check state information at the at least one node to obviate post-processing
overhead of the at least one pattern matched, to improve performance of the at least one processor relative to performing
post-processing of results for the content processing application; and

storing the data structure generated in at least one memory operatively coupled to the at least one processor.

US Pat. No. 9,094,669

VIDEO ENCODER BIT ESTIMATOR FOR MACROBLOCK ENCODING

Cavium, Inc., San Jose, ...

1. A method for predicting an illegal macroblock within a video encoder, the method comprising:
deciding a prediction mode based on motion estimation and spatial prediction for a macroblock within a video image;
performing a transform on data and a specific compression level over the data within the macroblock from the prediction mode
selected;

performing a bit estimation function to predict a size of the macroblock based on the prediction mode and the transformed
data;

comparing the predicted size to a threshold; and
applying a special prediction mode to encode the macroblock to reduce the size of the encoded macroblock when the predicted
size exceeds the threshold.

US Pat. No. 9,065,781

MESSAGING WITH FLEXIBLE TRANSMIT ORDERING

Cavium, Inc., San Jose, ...

1. A system comprising:
a packet reception unit configured to receive a packet, create a header indicating scheduling of the packet in a plurality
of cores, the header based on the content of the packet, and concatenate the header and the packet; and a plurality of reassembly
stores, wherein receiving the packet includes receiving at least one fragment of the packet, storing the at least one fragment
in a particular reassembly store corresponding with the packet, and when the particular reassembly store contains at least
one fragment of the packet such that the stored at least one fragment represents the packet as a whole, forwarding the packet
to the plurality of cores, wherein the packet reception unit is further configured to store the at least one fragment in
one of a plurality of memories within the reassembly stores, and, when the one of the plurality of memories is filled, copy
the at least one fragment to a memory external to the packet reception unit.
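
The spill behavior of the reassembly store can be sketched as below; the internal capacity and byte-string fragments are illustrative assumptions:

```python
class ReassemblyStore:
    """Per-packet fragment store with a small internal memory that is copied
    out to external memory when it fills (capacity is an assumption)."""
    def __init__(self, internal_capacity=4):
        self.internal_capacity = internal_capacity
        self.internal = []      # on-chip fragment memory (sketch)
        self.external = []      # external overflow memory (sketch)

    def add_fragment(self, fragment):
        self.internal.append(fragment)
        if len(self.internal) == self.internal_capacity:
            # internal memory filled: copy fragments to external memory
            self.external.extend(self.internal)
            self.internal.clear()

    def reassemble(self):
        # forward the packet once the stored fragments represent it as a whole
        return b"".join(self.external + self.internal)
```
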

US Pat. No. 9,372,772

CO-VERIFICATION OF HARDWARE AND SOFTWARE, A UNIFIED APPROACH IN VERIFICATION

CAVIUM, INC., San Jose, ...

1. A method of co-verification of a compiler and a device under test within a test bench verification environment implemented
by a test bench, the method comprising:
providing verification constraint information including code to implement one or more desired design features for the device
under test;

compiling the verification constraint information with the compiler to generate programming values for the device under test;
generating the device under test within the test bench verification environment based on the programming values produced by
the compiler such that the device under test operates according to the programming values; and

performing verification tests on the device under test to test one or more of the desired design features using stimulus from
the test bench thereby testing both the compiler and the device under test.

US Pat. No. 9,275,336

METHOD AND SYSTEM FOR SKIPPING OVER GROUP(S) OF RULES BASED ON SKIP GROUP RULE

Cavium, Inc., San Jose, ...

1. A method for forcing a search processor to skip over rules within a group of rules, the method comprising:
in a compiler provided with a set of rules for matching a key, the set of rules divided into groups, each group being prioritized
with respect to each other and each rule within each group being prioritized with respect to each other, and the set of rules
including at least one skip group rule;

rewriting rules belonging to a same group as the skip group rule and having priorities lower than the skip group rule, the
lower priority rules being rewritten based on the skip group rule such that in response to matching a key to the skip group
rule, a search processor skips over the skip group rule and the lower priority rules;

providing the rewritten rules to the search processor;
wherein rewriting the lower priority rules includes subtracting the skip group rule from each of the lower priority rules,
each lower priority rule being rewritten with a respective subtracted rule as one or more rewritten rules;

wherein subtracting includes, given the skip group rule including a field having a first bitmask and each of the lower priority
rules including a corresponding field having a second bitmask, for each second bitmask, inverting the first bitmask and intersecting
the inverted first bitmask with a subject second bitmask to form a third bitmask, and including the third bitmask in a rewritten
rule;

wherein intersecting the inverted first bitmask with the subject second bitmask includes, bit-by-bit:
intersecting a don't-care bit of the inverted first bitmask with a don't-care bit of the subject second bitmask yields a don't-care
bit in the third bitmask;

intersecting a don't-care bit with a value bit yields the value bit in the third bitmask;
intersecting a value bit with an equal value bit yields the value bit in the third bitmask; and
intersecting a value bit with an unequal value bit yields the third bitmask having a null value.
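
The bit-by-bit intersection rules recited in this claim map directly onto a small sketch; the ternary encoding ('0', '1', 'x' for don't-care) and the inversion rule (flip value bits, keep don't-cares) are assumptions for illustration:

```python
DONT_CARE = "x"

def invert(mask):
    # Hypothetical inversion: flip value bits, leave don't-care bits unchanged.
    return "".join(b if b == DONT_CARE else ("1" if b == "0" else "0") for b in mask)

def intersect(a, b):
    """Bit-by-bit ternary intersection per the claim; None models the null value."""
    out = []
    for x, y in zip(a, b):
        if x == DONT_CARE and y == DONT_CARE:
            out.append(DONT_CARE)   # don't-care with don't-care yields don't-care
        elif x == DONT_CARE:
            out.append(y)           # don't-care with a value yields the value
        elif y == DONT_CARE:
            out.append(x)
        elif x == y:
            out.append(x)           # equal value bits yield that value
        else:
            return None             # unequal value bits yield a null result
    return "".join(out)
```

Subtracting the skip group rule from a lower-priority rule then amounts to intersecting the inverted skip-rule bitmask with the lower-priority rule's bitmask.
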

US Pat. No. 9,495,479

TRAVERSAL WITH ARC CONFIGURATION INFORMATION

Cavium, Inc., San Jose, ...

1. A computer implemented method comprising:
by a processor, given a current node and an arc pointing from the current node to a next node, analyzing arcs in a data structure
to determine which of the arcs are valid arcs pointing from the next node;

by the processor, constructing arc configuration information associated with the next node, the arc configuration information
limited to only arc configuration information of the next node and representing each valid arc pointing from the next node;
and

by the processor, storing the arc configuration information associated with the next node, enabling the arc configuration
information to be evaluated and each of the valid arcs pointing from the next node to be identified from the evaluation of
the arc configuration information without the next node being read to reduce memory accesses and processing time of the processor.
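
One way to realize arc configuration information of this kind is a bitmap with one bit per possible arc label, so each valid arc can be identified without reading the next node; the bitmap encoding is an assumption, not the patent's representation:

```python
def build_arc_config(valid_arc_labels):
    """Pack the valid outgoing arcs of the next node into one integer bitmap;
    bit i is set when label i is a valid arc (encoding is an assumption)."""
    bitmap = 0
    for label in valid_arc_labels:
        bitmap |= 1 << label
    return bitmap

def is_valid_arc(bitmap, label):
    """Evaluate the stored configuration instead of reading the next node."""
    return bool((bitmap >> label) & 1)
```
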

US Pat. No. 9,417,655

FREQUENCY DIVISION CLOCK ALIGNMENT

Cavium, Inc., San Jose, ...

1. A method for generating a clock signal, comprising:
at a root node of a clock distribution network, receiving a first clock signal;
at a first leaf node of a plurality of leaf nodes of the clock distribution network, detecting a reference event and generating
a synchronizing signal based on the detection of the reference event;

passing the synchronizing signal along a synchronizing signal path from the first leaf node to the root node via one or more
clocked storage cells, each storage cell being clocked from a corresponding point within the clock distribution network;

at the root node, generating a second clock signal from the first clock signal synchronized to the synchronizing signal received
at the root node, and distributing the second clock signal to the leaf nodes of the clock distribution network, the generating
of the second clock signal resulting in the second clock signal received at the first leaf node being synchronized to the
detected reference event.

US Pat. No. 9,372,800

INTER-CHIP INTERCONNECT PROTOCOL FOR A MULTI-CHIP SYSTEM

Cavium, Inc., San Jose, ...

1. A method of providing memory coherence between multiple chip devices of a multi-chip system, the method comprising:
maintaining, at a first chip device of the multi-chip system, state information indicative of one or more states of one or
more copies of a data block, the data block stored in a memory associated with one of the multiple chip devices, the one or
more copies of the data block residing in one or more chip devices of the multi-chip system;

receiving, by the first chip device, a message associated with a copy of the one or more copies of the data block from a second
chip device of the multiple chip devices; and

in response to the message received, executing, by the first chip device, a scheme of one or more actions determined based
on the state information maintained at the first chip device and the message received;

wherein the data block is stored in a memory attached to the first chip device and the message is indicative of a state, maintained
at the second chip device, of a copy of the data block residing in the second chip device.
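
The state-plus-message lookup in this claim resembles a directory-style action table; the state names, message types, and actions below are illustrative assumptions, not the protocol's actual encoding:

```python
# Hypothetical action table for the home chip device: the scheme of actions is
# determined by the maintained state of the tracked copy and the message type.
ACTIONS = {
    ("shared",    "write_request"): ("invalidate_sharers", "grant_exclusive"),
    ("exclusive", "read_request"):  ("fetch_from_owner", "add_sharer"),
    ("invalid",   "read_request"):  ("read_local_memory", "add_sharer"),
}

def handle_message(state, message):
    """Execute the scheme of actions determined by state info and the message."""
    return ACTIONS.get((state, message), ())
```
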

US Pat. No. 9,294,567

SYSTEMS AND METHODS FOR ENABLING ACCESS TO EXTENSIBLE STORAGE DEVICES OVER A NETWORK AS LOCAL STORAGE VIA NVME CONTROLLER

CAVIUM, INC., San Jose, ...

1. A system to support remote storage virtualization, comprising:
a non-volatile memory express (NVMe) storage proxy engine running on a physical NVMe controller, which in operation, is configured
to:

create and map one or more logical volumes in one or more non-volatile memory express (NVMe) namespaces to a plurality of
remote storage devices accessible over a network;

convert the NVMe namespaces of the logical volumes in a first instruction to storage volumes of the remote storage devices
in a second instruction according to a storage network protocol, wherein the first instruction performs a read/write operation
on the logical volumes;

an NVMe access engine running on the physical NVMe controller, which in operation, is configured to
present the NVMe namespaces of the logical volumes to one or more virtual machines (VMs) running on a host as local storage
volumes;

receive said first instruction to perform the read/write operation on the logical volumes from one of the VMs;
present result and/or data of the read/write operation to the VM after the read/write operation has been performed on the
storage volumes of the remote storage devices over the network using the second instruction.
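
The namespace-to-remote-volume conversion performed by the storage proxy engine can be sketched as a mapping plus an instruction rewrite; the target/volume naming and command fields are illustrative assumptions:

```python
class NvmeStorageProxy:
    """Sketch of the proxy translation: NVMe reads/writes addressed to logical
    volumes in namespaces are converted into operations on remote storage
    volumes (field names are assumptions, not the NVMe wire format)."""
    def __init__(self):
        self.volume_map = {}    # namespace id -> (remote target, remote volume)

    def map_namespace(self, nsid, remote_target, remote_volume):
        self.volume_map[nsid] = (remote_target, remote_volume)

    def convert(self, nvme_cmd):
        """Translate a first (NVMe) instruction into a second (network) one."""
        target, volume = self.volume_map[nvme_cmd["nsid"]]
        return {"target": target, "volume": volume,
                "op": nvme_cmd["op"], "lba": nvme_cmd["lba"]}
```
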

US Pat. No. 9,281,034

DATA STROBE GENERATION

Cavium, Inc., San Jose, ...

1. A method of generating strobe signals, the method comprising:
operating a first multiplexer to select between a control signal and the control signal inverted and delayed;
generating a first strobe signal in a first memory circuit mode by operating a second multiplexer with a clock signal to select
between a first input signal and a second input signal, the first input signal having a static first signal level and the
second input signal corresponding to the control signal output by the first multiplexer; and

generating a second strobe signal in a second memory circuit mode by operating the second multiplexer with the clock signal
to select between the first input signal and the second input signal, the first input signal corresponding to the control
signal inverted and delayed output by the first multiplexer, and the second input signal having a static second signal level.

US Pat. No. 9,130,819

METHOD AND APPARATUS FOR SCHEDULING RULE MATCHING IN A PROCESSOR

Cavium, Inc., San Jose, ...

1. A method of scheduling rule matching threads initiated by a plurality of initiating engines in a network search processor,
for processing by multiple matching engines of the network search processor, the method comprising:
determining, by a scheduler, a set of bundles of rule matching threads, each bundle being initiated by a separate initiating
engine;

distributing rule matching threads in each bundle of the set of bundles into a number of subgroups of rule matching threads;
assigning the subgroups of rule matching threads associated with each bundle of the set of bundles to multiple scheduling
queues; and

sending rule matching threads, assigned to each scheduling queue, toward rule matching engines according to an order based
on priorities associated with the respective bundles of rule matching threads.
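
The bundle-to-subgroup-to-queue distribution and the priority-ordered draining can be sketched as follows; the subgroup size, the round-robin spreading, and the use of bundle index as priority are assumptions:

```python
from itertools import cycle

def assign_to_queues(bundles, num_queues, subgroup_size=2):
    """Split each bundle of rule matching threads into fixed-size subgroups and
    spread the subgroups round-robin over the scheduling queues (sketch)."""
    queues = [[] for _ in range(num_queues)]
    rr = cycle(range(num_queues))
    for priority, threads in enumerate(bundles):   # bundle index = priority (assumption)
        for i in range(0, len(threads), subgroup_size):
            queues[next(rr)].append((priority, threads[i:i + subgroup_size]))
    return queues

def drain(queue):
    """Send threads from one queue toward the matching engines in an order
    based on the priorities of their bundles."""
    return [t for _, sub in sorted(queue, key=lambda e: e[0]) for t in sub]
```
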

US Pat. No. 9,130,549

MULTIPLEXER FLOP

Cavium, Inc., San Jose, ...

1. A flip flop circuit comprising:
a master latch comprising a master storage element, a first data leg, and a second data leg, the first and second data legs
coupled to the master storage element, and clock selection logic coupled to the first and second data legs, the clock selection
logic having a select input for selecting between the first and second data legs; and

a slave latch coupled to the master latch, the slave latch including a slave storage element,
wherein the clock selection logic comprises first and second NOR gates, the first and second NOR gates each sharing and receiving
a single clock input, the first NOR gate receiving the select input, the second NOR gate receiving an inverse of the select
input, the first NOR gate generating a first clock output which enables a first data output of the first data leg to a first
node of the master storage element, and the second NOR gate generating a second clock output which enables a second data output
of the second data leg to the first node of the master storage element.

US Pat. No. 9,503,218

METHOD AND APPARATUS FOR QUANTIZING SOFT INFORMATION USING NON-LINEAR LLR QUANTIZATION

Cavium, Inc., San Jose, ...

1. A method for processing information over a network, comprising:
receiving a first set of signals representing a first logic value from a transmitter via a physical communication channel;
demodulating the first set of signals in accordance with a soft decoding scheme and generating a Log Likelihood Ratio (LLR)
value representing the first logic value;

generating a quantized LLR value in response to the LLR value via a non-linear LLR quantizer;
storing the quantized LLR value representing the compressed first logic value in a local storage; and
retrieving the quantized LLR value from the local storage and decompressing the quantized LLR value to restore the LLR value
when a decoder is ready to decode the LLR value,

wherein generating a quantized LLR value further includes extracting a sign bit from content bits of the LLR value, generating
a log result of content by taking a logarithm in base 2 (log 2) of the content bits of the LLR value, and concatenating the
log result with the sign bit to produce a log 2 quantized LLR value representing the first logic value.
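
The sign-bit extraction, log-2 of the content bits, and concatenation can be sketched as below; the magnitude bit-width and the floor-of-log2 rounding are assumptions:

```python
import math

def quantize_llr(llr, mag_bits=3):
    """Extract the sign bit, take log2 of the magnitude (content bits), and
    concatenate sign and log result into one quantized code (sketch)."""
    sign = 0 if llr >= 0 else 1
    mag = abs(llr)
    log2_mag = 0 if mag < 1 else min(int(math.log2(mag)), (1 << mag_bits) - 1)
    return (sign << mag_bits) | log2_mag

def dequantize_llr(code, mag_bits=3):
    """Decompress the quantized code back to an (approximate) LLR value."""
    sign = -1 if (code >> mag_bits) & 1 else 1
    return sign * (1 << (code & ((1 << mag_bits) - 1)))
```

The quantization is lossy by design: only powers of two are restored exactly, which is the price of the non-linear compression.
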

US Pat. No. 9,490,033

AUTO-BLOW MEMORY REPAIR

Cavium, Inc., San Jose, ...

1. A method comprising:
during a logic reset of an integrated circuit, collecting, at a built-in self-repair (BISR) controller within the integrated
circuit, defect information indicative of defects identified during a built-in self-test (BIST) operation performed on plural
memories embedded within the integrated circuit, wherein the collecting the defect information is automated in hardware on
the integrated circuit and the collecting includes sending a defect request from the BISR controller to at least one memory
of the plural memories;

during the logic reset, blowing, by a fuse controller within the integrated circuit, one or more fuses within the integrated
circuit based on the defect information collected, wherein the blowing the one or more fuses by the fuse controller is initiated
and automated in hardware on the integrated circuit;

during the logic reset, using, by the BISR controller, one or more fuses blown to inform a built-in self-repair (BISR) operation
performed on the plural memories by BISR logic, each of the plural memories having BISR logic embedded within.

US Pat. No. 9,141,548

METHOD AND APPARATUS FOR MANAGING WRITE BACK CACHE

Cavium, Inc., San Jose, ...

1. A network services processor comprising:
a plurality of processors;
a coherent shared memory including a cache and a memory, the coherent shared memory shared by the plurality of processors;
a free pool allocator configured to maintain pools of pointers to free memory locations;
an input/output bridge coupled to the plurality of processors and the cache, the input/output bridge intercepting memory free
commands issued by one or more of the processors and destined for the free pool allocator, each memory free command requesting
to free one or more portions in the memory, and based on intercepting a memory free command to request to free a selected
memory portion, issuing a don't write back command to a cache controller, and forwarding the memory free command to the free
pool allocator based on completion of the don't write back command; and

the cache controller coupled to the plurality of processors, the cache and the input/output bridge, the cache controller configured
to receive the don't write back command from the input/output bridge and to compare an address of the selected memory portion
to addresses stored in the cache and, in an event the address of the selected memory portion is replicated in the cache, void
a memory update to the selected memory portion by clearing a dirty bit associated with a corresponding modified cache block.
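
The don't-write-back path, voiding the pending memory update for a freed block by clearing its dirty bit, can be sketched as follows; the dictionary-based cache model is an illustrative assumption:

```python
class CacheController:
    """Sketch of the don't-write-back behavior: when a freed memory portion's
    address is replicated in the cache, the pending memory update is voided by
    clearing the dirty bit of the corresponding modified block."""
    def __init__(self):
        self.lines = {}     # address -> {"data": ..., "dirty": bool}

    def write(self, address, data):
        self.lines[address] = {"data": data, "dirty": True}

    def dont_write_back(self, address):
        line = self.lines.get(address)
        if line is not None:
            line["dirty"] = False   # freed block will not be written back

    def writeback_addresses(self):
        return [a for a, line in self.lines.items() if line["dirty"]]
```
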

US Pat. No. 9,128,769

PROCESSOR WITH DEDICATED VIRTUAL FUNCTIONS AND DYNAMIC ASSIGNMENT OF FUNCTIONAL RESOURCES

Cavium, Inc., San Jose, ...

1. A method in a processor having a plurality of hardware resources comprising:
on at least one clock cycle:
setting a mode of the processor, the mode being one of a virtual function mode or a physical function mode;
if the mode of the processor is set to the virtual function mode, determining a plurality of virtual functions corresponding
to the virtual function mode; wherein the virtual functions are configured to share the plurality of hardware resources among
the plurality of virtual functions; and assigning each work store of a plurality of work stores into one of a plurality of
virtual functions;

assigning the plurality of work stores into one physical function if the mode of the processor is set to the physical function
mode;

on each clock cycle:
releasing any idle hardware resource to be available for any virtual function or physical function; and
dispatching work from any work store corresponding to any virtual function or physical function to any released hardware resources;
dispatching including arbitrating the work within each of the plurality of virtual functions to nominate corresponding work
units and arbitrating among each of the corresponding work units to select a work unit for dispatch if the mode is set to
the virtual function mode or arbitrating the work within the physical function to select a work unit for dispatch if the mode
is set to the physical function mode.
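
The two-level arbitration for virtual function mode can be sketched as below; the nomination policy (oldest work unit per VF) and the grant order are assumptions, not the patent's arbiters:

```python
def dispatch(vf_work, num_resources):
    """Two-level arbitration sketch: each virtual function first nominates its
    oldest work unit, then nominees are granted released hardware resources in
    VF order until the resources run out (policies are assumptions)."""
    nominees = [(vf, units[0]) for vf, units in vf_work.items() if units]
    dispatched = []
    for vf, unit in nominees[:num_resources]:
        vf_work[vf].pop(0)          # work unit leaves its work store
        dispatched.append((vf, unit))
    return dispatched
```
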

US Pat. No. 9,495,161

QOS BASED DYNAMIC EXECUTION ENGINE SELECTION

Cavium, Inc., San Jose, ...

1. A processor comprising:
a plurality of processing cores;
a plurality of instruction stores, each instruction store storing at least one instruction, each instruction having a corresponding
group number, each instruction store having a unique identifier;

a store component storing a group execution matrix and a store execution matrix, the group execution matrix comprising a plurality
of group execution masks, each group execution mask corresponding to a given group number and indicating which cores can process
an instruction from the given group number; the store execution matrix comprising a plurality of store execution masks;

a core selection unit configured to
for each instruction within each instruction store:
select a store execution mask from the store execution matrix based on the unique identifier of a selected instruction store,
select at least one group execution mask from the group execution matrix based on the group number of at least one selected
instruction from the selected instruction store, and

for each selected group execution mask of the at least one group execution masks, define a core request mask based on the
selected group execution mask and the store execution mask, the core request mask corresponding to the selected instruction
store and indicating candidate cores; and

an arbitration unit configured to determine instruction priority among each instruction, each instruction store having at
least one corresponding core request mask, accordingly assign an instruction for each available core, where the core request
mask corresponding to the instruction store of the instruction indicates candidate cores that intersect with the available
cores, and signal the instruction store corresponding to the assigned instruction to send the assigned instruction to the
available core.
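
The mask algebra in this claim reduces to bitwise operations; treating each mask as an integer bitmap (bit i = core i), the core request mask is the AND of the group and store execution masks, and assignment intersects it with the available cores:

```python
def core_request_mask(group_mask, store_mask):
    """Candidate cores must be allowed by both the group execution mask and
    the store execution mask, i.e. the bitwise AND of the two."""
    return group_mask & store_mask

def assign_core(request_mask, available_mask):
    """Assign the lowest-numbered core that is both a candidate and available
    (the lowest-first tie-break is an assumption); None when none intersect."""
    candidates = request_mask & available_mask
    if candidates == 0:
        return None
    return (candidates & -candidates).bit_length() - 1   # index of lowest set bit
```
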

US Pat. No. 9,307,057

METHODS AND SYSTEMS FOR RESOURCE MANAGEMENT IN A SINGLE INSTRUCTION MULTIPLE DATA PACKET PARSING CLUSTER

Cavium, Inc., San Jose, ...

1. A method for operating a SIMD packet parsing cluster, wherein the cluster includes a plurality of M>=2 packet parsing engines
1 to M, and the cluster further includes a shared memory and an instruction memory storing a plurality of instructions to
be performed by each of the engines, and wherein the instructions include one or more memory accessing instructions that require
accessing the shared memory, the method comprising:
transmitting the instructions to the engines for the instructions to be executed by the engines;
for each of the engines 2 to M, delaying execution of each of the memory
accessing instructions by a delay time compared to a previous engine; and
each one of the engines performing one of the memory accessing instructions at a time that the other engines are not performing
one of the memory accessing instructions by inserting a lag time between each two consecutive memory accessing instructions
for engine 1, the lag time being greater than or equal to a time enabling the 2 to M engines to complete memory accessing
instruction corresponding to the first of the two consecutive memory accessing instructions.
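
The staggering described above can be sketched as a schedule: engine k starts memory access j at j·lag + k·access_time, and choosing lag = M·access_time guarantees no two engines access the shared memory at once (unit access time is an assumption):

```python
def memory_access_schedule(num_engines, num_accesses, access_time=1):
    """Start times for each engine's memory accessing instructions: engine k is
    delayed by k*access_time relative to engine 1 (k=0), and a lag of
    num_engines*access_time separates consecutive accesses (sketch)."""
    lag = num_engines * access_time
    return {k: [j * lag + k * access_time for j in range(num_accesses)]
            for k in range(num_engines)}
```
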

US Pat. No. 9,330,227

TESTBENCH BUILDER, SYSTEM, DEVICE AND METHOD INCLUDING A DISPATCHER

Cavium Inc., San Jose, C...

1. A testbench system stored on a non-transitory computer readable medium for testing operation of a device under test, the
testbench system comprising:
a plurality of agents coupled with the device under test, wherein one or more of the plurality of agents are configured to
output one or more transactions to the device under test and a different one or more of the plurality of agents are configured
to input one or more device responses to the transactions from the device under test; and

a dispatcher including an agent table and coupled with a reference model, a scoreboard and the plurality of agents, wherein
the dispatcher is configured to:

input data comprising a copy of each of the one or more transactions and the one or more device responses to the transactions;
identify whether each portion of the data is one of the copies of each of the one or more transactions or one of the device
responses based on the agent table; and

route each portion of data identified as one of the copies of each of the one or more transactions to the reference model
and each portion of data identified as one of the one or more device responses to the scoreboard.
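
The dispatcher's agent-table routing can be sketched as a simple classifier; the table shape and agent kinds are illustrative assumptions:

```python
def route(data_items, agent_table):
    """Identify each portion of data by its source agent via the agent table,
    then route transaction copies to the reference model and device responses
    to the scoreboard (table contents are assumptions)."""
    reference_model, scoreboard = [], []
    for agent_id, payload in data_items:
        if agent_table[agent_id] == "transaction":
            reference_model.append(payload)
        else:
            scoreboard.append(payload)
    return reference_model, scoreboard
```
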

US Pat. No. 9,208,103

TRANSLATION BYPASS IN MULTI-STAGE ADDRESS TRANSLATION

Cavium, Inc., San Jose, ...

1. A circuit comprising:
a cache configured to store translations between address domains, the cache addressable as a first logical portion and a second
logical portion, the first logical portion configured to store translations between a first address domain and a second address
domain, the second logical portion configured to store translations between the second address domain and a third address
domain; and

a processor configured to 1) control a bypass of at least one of the first and second logical portions with respect to a translation
of an address indicated by an address request, the processor determining whether to activate the bypass based on an attribute
of the address indicated by the address request, and 2) match the address request against a non-bypassed portion of the cache
in accordance with the bypass and output a corresponding address result.
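
The attribute-controlled bypass of one translation stage can be sketched with dictionary lookups standing in for the two logical cache portions; the attribute names and table shapes are assumptions:

```python
def translate(addr, attr, stage1, stage2, bypass_attrs=("stage1_bypass",)):
    """Two-stage translation with bypass: when the address attribute marks the
    first stage unnecessary, match only the second logical portion
    (attribute names are illustrative)."""
    if attr not in bypass_attrs:
        addr = stage1[addr]     # first address domain -> second address domain
    return stage2[addr]         # second address domain -> third address domain
```
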

US Pat. No. 9,058,463

SYSTEMS AND METHODS FOR SPECIFYING, MODELING, IMPLEMENTING AND VERIFYING IC DESIGN PROTOCOLS

CAVIUM, INC., San Jose, ...

1. A system, comprising:
a formal verification engine running on a host, which in operation, automatically generates and formally verifies a reference
specification that includes a plurality of extended state tables for an integrated circuit (IC) design protocol of a chip
at architectural level;

a protocol checking engine running on a host, which in operation, checks and validates completeness and correctness of the
reference specification;

a micro architect engine running on a host, which in operation, implements the IC design protocol at the micro-architectural
level using a synthesizable package generated from the formally verified reference specification;

a dynamic verification (DV) engine running on a host, which in operation, dynamically verifies the implementation of the IC
design protocol at the micro-architectural level and incorporates all incremental changes to the IC design protocol in real
time based on a DV reference model generated from the extended state tables of the reference specification.

US Pat. No. 9,501,243

METHOD AND APPARATUS FOR SUPPORTING WIDE OPERATIONS USING ATOMIC SEQUENCES

Cavium, Inc., San Jose, ...

1. A method comprising:
initiating, by a processor, an atomic sequence by executing an operation designed to initiate the atomic sequence and allocate
a memory buffer;

storing one or more data words in a concatenation of one or more memory locations by executing a conditional storing operation,
the conditional storing operation being designed to automatically check the memory buffer allocated for any data stored therein,
and store the one or more data words based on a result of checking the memory buffer; wherein the operation designed to initiate
the atomic sequence is a load operation designed to initiate the atomic sequence, and executing the load operation designed
to initiate the atomic sequence includes loading a data word; and

storing data in the memory buffer allocated by executing one or more regular storing operations within the atomic sequence,
wherein storing one or more data words by executing a conditional storing operation includes:

storing one or more first data words, associated with data stored in the memory buffer, and a second data word in a concatenation
of two or more memory locations, the one or more first data words and the second data word having a cumulative width greater
than a data word width associated with the processor.

US Pat. No. 9,483,100

METHOD AND APPARATUS FOR POWER GATING HARDWARE COMPONENTS IN A CHIP DEVICE

Cavium, Inc., San Jose, ...

1. A semiconductor device comprising:
a hardware component;
a transistor, coupled to the hardware component, for gating power supply to the hardware component; and
a controller configured to operate the transistor in a manner to limit electric current dissipated to the hardware component
during a transition period by gradually decreasing, or gradually increasing, a magnitude of an input signal of the transistor
during the transition period.
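
The gradual ramping of the transistor's input signal can be sketched as a stepped sequence; the step count and voltage levels are illustrative assumptions:

```python
def gate_ramp(v_off, v_on, steps):
    """Gradually step the gating transistor's input over the transition period
    to limit the current dissipated to the hardware component (sketch)."""
    delta = (v_on - v_off) / steps
    return [v_off + delta * (i + 1) for i in range(steps)]
```
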

US Pat. No. 9,460,033

APPARATUS AND METHOD FOR INTERRUPT COLLECTING AND REPORTING STATUS AND DELIVERY INFORMATION

Cavium, Inc., San Jose, ...

1. A method for interrupt collecting and reporting, comprising:
storing for each of at least one interrupt a status indicator, an enable status, and an interrupt delivery information in
a first structure;

storing for each of the at least one interrupt at least an indicator of one or more entities to execute an interrupt handler
routine in a second structure; and

reporting one of the at least one interrupt to the one or more entities to execute an interrupt handler routine designated
in accordance with the status indicator, the enable status, the interrupt delivery information, and the at least one indicator.
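
The two-structure lookup in this claim can be sketched with dictionaries standing in for the first and second structures; the field names are illustrative assumptions:

```python
def report_interrupts(first_struct, second_struct):
    """Report each pending, enabled interrupt together with its delivery
    information and the entities designated to execute its handler
    (structure layouts are assumptions)."""
    reports = []
    for irq, info in first_struct.items():
        if info["pending"] and info["enabled"]:
            reports.append((irq, info["delivery"], second_struct[irq]))
    return reports
```
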

US Pat. No. 9,087,567

METHOD AND APPARATUS FOR AMPLIFIER OFFSET CALIBRATION

Cavium, Inc., San Jose, ...

1. A method of calibrating an amplifier offset, the method comprising:
applying an input value to both input leads of an amplifier, the amplifier including one or more digital-to-analog converters
(DACs) used to calibrate an offset of the amplifier;

updating, over a number of iterations by a control logic coupled to the amplifier, a digital value based on an output of the
amplifier, the digital value updated being provided as input to a DAC of the one or more DACs in the amplifier; and

employing a final value of the digital value as input to the DAC of the one or more DACs in the amplifier for calibrating
the offset of the amplifier during a data reception phase.
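
The iterative digital-value update can be sketched as a successive-approximation search over the DAC code; the bit-by-bit policy and the toy comparator model are assumptions, not the patent's control logic:

```python
def calibrate_offset(comparator, dac_bits=6):
    """Successive-approximation sketch: try each DAC bit from MSB to LSB and
    keep it set if the amplifier output still reads positive (i.e., more
    offset correction is needed). Returns the final digital value."""
    code = 0
    for bit in reversed(range(dac_bits)):
        trial = code | (1 << bit)
        if comparator(trial) > 0:
            code = trial
    return code
```
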

US Pat. No. 9,506,982

TESTBENCH BUILDER, SYSTEM, DEVICE AND METHOD INCLUDING A GENERIC MONITOR AND TRANSPORTER

Cavium, Inc., San Jose, ...

1. A testbench system stored on a non-transitory computer readable medium for testing operation of a device under test, the
testbench system comprising:
a verification environment including a scoreboard and a reference model; and
a plurality of agents operating within the verification environment, wherein each of the agents comprise:
a generic monitor having monitor code and configured to monitor one or more transactions transmitted on an interface between
the agent and the device under test using the monitor code; and

a transporter coupled with the generic monitor and coupled with the device under test via the interface, wherein the monitor
and the transporter are configured to together perform a handshake protocol with the device under test over the interface
based on a class of the interface using the monitor code and transporter code of the transporter, and the transporter samples
at least one of the transactions based on an outcome of the handshake protocol using the transporter code, wherein using the
monitor code the monitor is configured to forward the at least one of the transactions to the scoreboard or the reference
model based on the outcome, wherein the monitor code and the transporter code are stored on the non-transitory computer readable
medium.

US Pat. No. 9,390,209

SYSTEM FOR AND METHOD OF COMBINING CMOS INVERTERS OF MULTIPLE DRIVE STRENGTHS TO CREATE TUNE-ABLE CLOCK INVERTERS OF VARIABLE DRIVE STRENGTHS IN HYBRID TREE-MESH CLOCK DISTRIBUTION NETWORKS

CAVIUM, INC., San Jose, ...

1. A computer-aided design process for manufacturing a semiconductor device having a clock distribution network thereon, the
method comprising:
determining target drive strengths of clock signals for multiple sequential components on the semiconductor device;
determining groups of standard clock-driving elements on the semiconductor device, wherein each of the groups has a group
drive strength equal to a sum of the drive strengths of the clock-driving elements in the group, each of the group drive strengths
substantially equal to one of the target drive strengths;

determining a fabrication process for combining the clock-driving elements into the groups; and
fabricating the clock distribution network on the semiconductor device according to the fabrication process, wherein the clock
distribution network includes a plurality of standard clock-driving elements in a first layer of the semiconductor device
and output pins of each of the standard clock-driving elements in a second layer, and further wherein the fabricating comprises,
for each of the determined groups, electrically coupling together the standard clock-driving elements of that group with vias
between the first layer and the second layer.
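
Forming a group whose drive strength sums to a target can be sketched with a greedy selection over the standard strengths; the greedy policy is an assumption, not the patent's grouping method:

```python
def group_drive_strengths(target, standard_strengths):
    """Greedy sketch: combine standard clock-driving elements so the group
    drive strength (sum of member strengths) reaches the target."""
    group, remaining = [], target
    for s in sorted(standard_strengths, reverse=True):
        while s <= remaining:
            group.append(s)
            remaining -= s
    return group
```
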

US Pat. No. 9,264,385

MESSAGING WITH FLEXIBLE TRANSMIT ORDERING

Cavium, Inc., San Jose, ...

1. A system comprising:
a plurality of reassembly stores configured to store at least one fragment of a packet in a particular reassembly store corresponding
with the packet, and when the particular reassembly store contains at least one fragment of the packet representing the packet
as a whole, forward the packet to a plurality of cores; and

a packet reception unit configured to store the at least one fragment in one of a plurality of memories within the reassembly
stores, and, when the one of the plurality of memories is filled, copy the at least one fragment to a memory external to the
packet reception unit.

US Pat. No. 9,195,939

SCOPE IN DECISION TREES

Cavium, Inc., San Jose, ...

1. A method comprising:
compiling a decision tree data structure including a plurality of nodes using a classifier table having a plurality of rules
representing a search space for packet classification, the plurality of rules having at least one field, the plurality of
nodes each covering a portion of the search space by representing successively smaller subsets of the plurality of rules with
increasing depth in the decision tree data structure;

for each node of the decision tree data structure, (a) computing a node scope value indicating a node portion of the search
space covered by the node; (b) for each rule intersecting the node, computing a rule scope value indicating a rule portion
of the node portion covered by the rule; (c) comparing the node portion of the search space covered by the node to an amount
of the node portion covered by rules intersecting the node by computing a scope factor for the node based on the node scope
value computed and the rule scope value computed for each rule; and

using the scope factor computed for at least one node of the plurality of nodes as an input parameter to a decision for performing
a compiler operation at the at least one node.
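
One plausible reading of the scope values is bits of search space covered by an axis-aligned box (sum of log2 of field widths); the box representation and the difference-based scope factor below are assumptions for illustration:

```python
import math

def scope(box):
    """Scope of a box: total bits of search space covered, computed as the
    sum of log2 of each field's range width (an assumed interpretation)."""
    return sum(math.log2(hi - lo) for lo, hi in box)

def scope_factor(node_box, rule_boxes):
    """Compare the node's coverage to the average coverage of the rules
    intersecting it; a large factor suggests sparse rule coverage."""
    avg_rule = sum(scope(r) for r in rule_boxes) / len(rule_boxes)
    return scope(node_box) - avg_rule
```
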

US Pat. No. 9,183,244

RULE MODIFICATION IN DECISION TREES

Cavium, Inc., San Jose, ...

1. A method comprising:
receiving an incremental update specifying a new rule definition including a modification of at least one field of a designated
rule of a plurality of rules, the plurality of rules being represented by a Rule Compiled Data Structure (RCDS) as a decision
tree for packet classification utilized by an active search process, the plurality of rules representing a search space for
the packet classification, each rule of the plurality of rules having an original rule definition defining a subset of the
search space;

determining an intersection of the new rule definition specified and the original rule definition of the designated rule;
setting the original rule definition of the designated rule to an intermediate rule definition defined by the intersection
determined and incorporating a series of one or more updates determined in the RCDS, atomically from the perspective of the
active search process; and

setting the intermediate rule definition of the designated rule to the new rule definition and incorporating the series of
one or more updates determined in the RCDS, atomically from the perspective of the active search process.
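The two-step update described above has a useful invariant: because the intermediate definition is the intersection of the old and new definitions, an active search never matches a key that neither definition would match. A minimal sketch, modeling a rule definition as a set of matching keys and an atomic RCDS commit as a callback (`Rule`, `modify_rule`, and `commit` are hypothetical names):

```python
def modify_rule(rule, new_definition, commit):
    """Two-step rule modification; each commit() stands in for an RCDS
    update that is atomic from the perspective of the active search."""
    rule.definition = rule.definition & new_definition  # step 1: shrink to
    commit(rule)                                        # the intersection
    rule.definition = new_definition                    # step 2: expand to
    commit(rule)                                        # the new definition

class Rule:
    def __init__(self, definition):
        self.definition = definition

seen = []
r = Rule({1, 2, 3})
modify_rule(r, {2, 3, 4}, lambda rule: seen.append(set(rule.definition)))
# seen == [{2, 3}, {2, 3, 4}]
```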

US Pat. No. 9,143,140

MULTI-FUNCTION DELAY LOCKED LOOP

Cavium, Inc., San Jose, ...

1. A delay circuit comprising:
a delay line configured to receive a clock signal and output a delayed clock signal;
a delay controller configured to control the delay line to output the delayed clock signal at a quadrature delay relative
to the clock signal;

a multiplexer receiving a plurality of delay signals, the delay signals including the clock signal and the delayed clock signal;
a state machine configured to control the multiplexer to select one of the delay signals to provide signal leveling among
a plurality of associated output signals; and

a second delay line configured to receive a data strobe signal and output a delayed data strobe signal.

US Pat. No. 9,491,099

LOOK-ASIDE PROCESSOR UNIT WITH INTERNAL AND EXTERNAL ACCESS FOR MULTICORE PROCESSORS

Cavium, Inc., San Jose, ...

1. A method for information lookup request processing at a look-aside processor unit, comprising:
storing a received lookup transaction request in a first look-aside buffer of a first storage;
rebuilding the lookup transaction request into a request packet;
transmitting the request packet;
receiving a packet;
comparing ternary values of specific bits in the received packet with ternary values of bits in a pre-determined mask;
determining the received packet to be a response packet when the bits in the received packet disagree with the bits in the
pre-determined mask; and

determining the received packet to be an exception packet when the specific bits in the received packet agree with the bits
in the pre-determined mask; and

processing the received packet in accordance with the determining.
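The ternary classification step above can be sketched directly: a packet agrees with the pre-determined mask when every specified (non-don't-care) mask bit matches the corresponding packet bit; agreement marks an exception packet and disagreement a response packet, per the claim's convention. The bit encoding here is an illustrative assumption.

```python
X = "x"  # ternary don't-care

def agrees(packet_bits, mask_bits):
    """True when every non-don't-care mask bit equals the packet bit."""
    return all(m == X or p == m for p, m in zip(packet_bits, mask_bits))

def classify(packet_bits, mask_bits):
    # Agreement with the pre-determined mask -> exception packet;
    # disagreement -> response packet.
    return "exception" if agrees(packet_bits, mask_bits) else "response"

# classify([1, 0, 1], [1, X, 1]) -> "exception"
# classify([1, 0, 0], [1, X, 1]) -> "response"
```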

US Pat. No. 9,411,361

FREQUENCY DIVISION CLOCK ALIGNMENT USING PATTERN SELECTION

Cavium, Inc., San Jose, ...

1. A method for generating a clock signal, comprising:
at a root node of a clock distribution network, receiving a first clock signal generated based on a reference clock signal;
at a first leaf node of a plurality of leaf nodes of the clock distribution network, detecting a reference event associated
with the reference clock signal and generating a synchronizing signal based on the detection of the reference event;

passing the synchronizing signal from the first leaf node to the root node;
at the root node, generating a second clock signal from the first clock signal synchronized to the synchronizing signal received
at the root node, and distributing the second clock signal to the leaf nodes of the clock distribution network, the generating
of the second clock signal including selecting a repeating pattern of cycles of the first clock signal, wherein the repeating
pattern includes fewer than all of the cycles of the first clock signal, and at least every cycle of the first clock signal
that is shifted in time by a propagation delay with respect to a rising edge of the reference clock signal or every cycle
of the first clock signal that is shifted in time by a propagation delay with respect to a falling edge of the reference clock
signal.

US Pat. No. 9,419,943

METHOD AND APPARATUS FOR PROCESSING OF FINITE AUTOMATA

Cavium, Inc., San Jose, ...

1. A security appliance operatively coupled to a network, the security appliance comprising:
at least one memory configured to store at least one finite automaton including a plurality of nodes generated from at least
one regular expression pattern;

at least one processor operatively coupled to the at least one memory and configured to walk the at least one finite automaton,
with segments of an input stream received via the network, to match the at least one regular expression pattern in the input
stream, the walk including:

walking at least two nodes of a given finite automaton, of the at least one finite automaton, in parallel, with a segment,
at a given offset within a payload, of a packet in the input stream, to optimize performance of run time processing of the
at least one processor for identifying an existence of the at least one regular expression pattern in the input stream;

determining a match result for the segment, at the given offset within the payload, at each node of the at least two nodes;
and

determining at least one subsequent action for walking the given finite automaton, based on an aggregation of each match result
determined.

US Pat. No. 9,268,694

MAINTENANCE OF CACHE AND TAGS IN A TRANSLATION LOOKASIDE BUFFER

Cavium, Inc., San Jose, ...

1. A circuit comprising:
a first cache configured to store translations between address domains, the first cache including first and second logical
portions, the first logical portion configured to store translations between a first address domain and a second address domain,
the second logical portion configured to store translations between the second address domain and a third address domain;

a second cache configured to store translations between the first address domain and the third address domain based on entries
in the first cache;

a third cache configured to store tags associated with the translations of the first and second cache; and
a processor configured to 1) write an entry to the third cache, the entry including a subset of fields populated from a corresponding
translation stored at the second cache, and 2) detect a deleted entry in at least one of the first logical portion and the
second logical portion and invalidate corresponding entries in the second and third caches;

wherein the processor is further configured, in response to detecting an absence of a matching entry in the second cache,
to match the address request against the first cache, the address result corresponding to an entry in the first cache.

US Pat. No. 9,263,151

MEMORY INTERFACE WITH SELECTABLE EVALUATION MODES

Cavium, Inc., San Jose, ...

1. A memory controller circuit comprising:
a pattern generator circuit configured to output a predetermined pattern signal;
a flip-flop configured to latch a received data signal at a first terminal of the memory controller circuit and output a looped-back
latched data signal; and

a multiplexer configured to select among the pattern signal in a first mode of operation, the looped-back latched data signal
in a second mode of operation, and a data output signal in a third mode of operation for output at a second terminal of the
memory controller, the second terminal being connected to an external testing device in at least one of the first and second
modes of operation and a memory device in the third mode of operation.

US Pat. No. 9,501,245

SYSTEMS AND METHODS FOR NVME CONTROLLER VIRTUALIZATION TO SUPPORT MULTIPLE VIRTUAL MACHINES RUNNING ON A HOST

CAVIUM, INC., San Jose, ...

1. A system to support non-volatile memory express (NVMe) controller virtualization, comprising:
an NVMe managing engine running on a host, which in operation, is configured to
create and initialize one or more virtual NVMe controllers running on a single physical NVMe controller, wherein each of the
virtual NVMe controllers is configured to support one of a plurality of virtual machines (VMs) running on the host to access
its storage units;

subsequently add or remove one or more of the virtual NVMe controllers based on the number of VMs running on the host and
physical limitations of the physical NVMe controller to support the virtual NVMe controllers;

said virtual NVMe controllers running on the single physical NVMe controller, wherein each of the virtual NVMe controllers
is configured to:

establish a logical volume of storage units having one or more corresponding namespaces for one of the VMs to access;
retrieve and process commands and/or data from the VM to access the namespaces or the logical volume of storage units for
the VM;

provide processing results of the commands and/or data back to the VM via the virtual NVMe controller.

US Pat. No. 9,430,268

SYSTEMS AND METHODS FOR SUPPORTING MIGRATION OF VIRTUAL MACHINES ACCESSING REMOTE STORAGE DEVICES OVER NETWORK VIA NVME CONTROLLERS

CAVIUM, INC., San Jose, ...

1. A computer-implemented method to support migration of virtual machines (VMs) accessing a set of remote storage devices
over a network via non-volatile memory express (NVMe) controllers, comprising:
enabling a first VM running on a first host to access and perform a plurality of storage operations to one or more logical
volumes in one or more NVMe namespaces created and mapped to the remote storage devices accessible by a first virtual NVMe
controller only over the network as if they were local storage volumes following a storage network protocol;

putting said first virtual NVMe controller running on a first physical NVMe controller currently serving the first VM into
a quiesce state when the first VM is being migrated from the first host to a second VM running on a second host;

capturing and saving an image of states of the first virtual NVMe controller on the first host;
creating a second virtual NVMe controller on a second physical NVMe controller using the saved image, wherein the second virtual
NVMe controller is configured to serve the second VM and has exactly the same states as the first virtual NVMe controller
in the quiesce state;

initiating and/or resuming the storage operations to the logical volumes mapped to the remote storage devices accessible by
the second virtual NVMe controller over the network without being interrupted by the migration of the first VM running on
the first host to the second VM running on the second host.

US Pat. No. 9,268,855

PROCESSING REQUEST KEYS BASED ON A KEY SIZE SUPPORTED BY UNDERLYING PROCESSING ELEMENTS

CAVIUM, INC., San Jose, ...

1. A method, executed by one or more processors, for processing a data packet, the method comprising:
receiving the packet;
creating a first request key using information extracted from the packet;
splitting the first request key into an n number of partial request keys if at least one predetermined criterion is met, wherein
n>1 and each of the n number of partial request keys is associated with a distinct set of the information extracted from the
packet;

sending a non-final request that includes an i-th partial request key to a corresponding search table of an n number of search
tables, wherein i&lt;n;
receiving a non-final search result from the corresponding search table;
sending a final request that includes an n-th partial request key and the non-final search result received in response to
sending the non-final request to the corresponding search table;

receiving a final search result from the corresponding search table; and
processing the packet based on processing data included in the final search result.
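The chained lookup above can be sketched with dictionaries standing in for the hardware search tables. This is only an illustrative reading: each non-final request carries a partial key (plus any result so far), and the final request carries the n-th partial key together with the last non-final result; the table contents and key strings are hypothetical.

```python
def lookup(partial_keys, tables):
    """Chain partial-key lookups across an n number of search tables."""
    n = len(tables)
    result = None
    for i in range(n - 1):
        result = tables[i][(partial_keys[i], result)]    # non-final requests
    return tables[n - 1][(partial_keys[n - 1], result)]  # final request

# Two tables chained: the non-final result keys the final lookup.
tables = [
    {("src=10.0.0.1", None): "intermediate-7"},
    {("dport=443", "intermediate-7"): "permit"},
]
# lookup(["src=10.0.0.1", "dport=443"], tables) -> "permit"
```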

US Pat. No. 9,473,601

METHOD OF REPRESENTING A GENERIC FORMAT HEADER USING CONTINUOUS BYTES AND AN APPARATUS THEREOF

CAVIUM, INC., San Jose, ...

1. A method of a rewrite engine, the method comprising:
detecting missing fields from a protocol header of an incoming packet;
based on the detection, expanding the protocol header to a generic format for a corresponding protocol, wherein the generic
format includes all possible fields that the corresponding protocol can have; and

maintaining a data structure for the expanded protocol header, wherein the data structure includes a first field and a second
field, wherein the first field indicates a number of contiguous valid bytes from a start of the expanded protocol header,
and the second field is a bit vector indicating the validity of each byte after the contiguous valid bytes in the expanded
protocol header.
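The two fields of that data structure can be computed from per-byte validity flags as follows. The LSB-first bit order in the vector is an assumption; the claim does not fix a bit convention.

```python
def validity_fields(valid_bytes):
    """Return (count of contiguous valid bytes from the start of the
    expanded header, bit vector over the bytes after that run,
    LSB = first byte after the run)."""
    count = 0
    while count < len(valid_bytes) and valid_bytes[count]:
        count += 1
    bit_vector = 0
    for i, valid in enumerate(valid_bytes[count:]):
        if valid:
            bit_vector |= 1 << i
    return count, bit_vector

# validity_fields([True, True, False, True]) -> (2, 0b10)
```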

US Pat. No. 9,349,434

VARIABLE STROBE FOR ALIGNMENT OF PARTIALLY INVISIBLE DATA SIGNALS

Cavium, Inc., San Jose, ...

1. A method of sampling data signals in response to a timing signal, said method comprising:
receiving data signals, wherein each of said data signals includes a valid-data window having an extent, wherein, when a data
signal is received, said valid-data window includes an invisible portion and a visible portion, wherein said invisible portion
is outside an observation window, and said visible portion is inside said observation window, wherein said data signals are
skewed relative to each other;

for each of said data signals, identifying a designated location within said valid-data window, wherein said designated location
is part way across said extent of said valid-data window; and

for each of said data signals, aligning said data signal such that said designated location aligns with said timing signal.

US Pat. No. 9,208,438

DUPLICATION IN DECISION TREES

Cavium, Inc., San Jose, ...

1. A method comprising:
building a decision tree structure representing a plurality of rules using a classifier table having the plurality of rules,
the plurality of rules having at least one field;

including a plurality of nodes in the decision tree structure, each node representing a subset of the plurality of rules,
each node having a leaf node type or a non-leaf node type;

linking each node having the leaf node type to a bucket, each node having the leaf node type being a leaf node, the bucket
representing the subset of the plurality of rules represented by the leaf node;

cutting each node having the non-leaf node type on one or more selected bits of a selected one or more fields of the at least
one field creating one or more child nodes having the non-leaf node type or the leaf node type, each node cut being a parent
node of the one or more child nodes created, the one or more child nodes created representing one or more rules of the parent
node;

identifying duplication in the decision tree structure;
modifying the decision tree structure based on the identified duplication; and
storing the modified decision tree structure.

US Pat. No. 9,129,060

QOS BASED DYNAMIC EXECUTION ENGINE SELECTION

Cavium, Inc., San Jose, ...

1. A processor comprising:
a plurality of processing cores;
a plurality of instruction stores, each instruction store storing at least one instruction, each instruction having a corresponding
group number, each instruction store having a unique identifier;

a store component storing a group execution matrix and a store execution matrix, the group execution matrix comprising a plurality
of group execution masks, each group execution mask corresponding to a given group number and indicating which cores can process
an instruction from the given group number; the store execution matrix comprising a plurality of store execution masks;

a core selection unit configured to for each instruction within each instruction store:
select a store execution mask from the store execution matrix using the unique identifier of a selected instruction store
as an index, select at least one group execution mask from the group execution matrix using the group number of at least one
selected instruction from the selected instruction store as an index, and

for each selected group execution mask of the at least one group execution masks, perform logic operations on the selected
group execution mask and the store execution mask to create a core request mask, the core request mask corresponding to the
selected instruction store and indicating zero, one, or more candidate cores; and

an arbitration unit configured to determine instruction priority among each instruction, each instruction store having at
least one corresponding core request mask, accordingly assign an instruction for each available core, where the core request
mask corresponding to the instruction store of the instruction indicates candidate cores that intersect with the available
cores, and signal the instruction store corresponding to the assigned instruction to send the assigned instruction to the
available core.

US Pat. No. 9,191,321

PACKET CLASSIFICATION

Cavium, Inc., San Jose, ...

1. A method comprising:
in a processor, building a decision tree structure including a plurality of nodes, each node representing a subset of a plurality
of rules having at least one field;

for at least one node of the decision tree structure, determining a number of cuts that may be made on each at least one field
creating child nodes equal to the number of cuts and selecting a field on which to cut the at least one node based on a comparison
of an average of a difference between an average number of rules per child node created and an actual number of rules per
child node created per each at least one field;

cutting the at least one node into a number of child nodes on the selected field; and
storing the decision tree structure in a memory, wherein selecting the field on which to cut the at least one node based on
the comparison enables the processor to build a wider, shallower decision tree structure relative to selecting the field on
which to cut not based on the comparison, reducing a search time of a search performed using the decision tree structure stored
in the memory.
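The field-selection comparison above can be sketched as a score per candidate field. This assumes the claim's comparison is minimized, i.e. the compiler prefers the field whose cuts spread rules most evenly across child nodes, which is what yields the wider, shallower tree; the claim does not name the tie-breaking or direction explicitly.

```python
def cut_score(rules_per_child):
    """Average absolute difference between the mean rules-per-child and the
    actual number of rules in each child, for one candidate field."""
    avg = sum(rules_per_child) / len(rules_per_child)
    return sum(abs(avg - r) for r in rules_per_child) / len(rules_per_child)

def select_cut_field(candidates):
    """candidates maps a field name to the rules-per-child counts that
    cutting on that field would produce; the lowest score wins."""
    return min(candidates, key=lambda field: cut_score(candidates[field]))

# An even split on "src" beats a skewed split on "dst":
# select_cut_field({"src": [4, 4, 4, 4], "dst": [13, 1, 1, 1]}) -> "src"
```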

US Pat. No. 9,613,679

CONTROLLED DYNAMIC DE-ALIGNMENT OF CLOCKS

Cavium, Inc., San Jose, ...

1. A method for controlling operation of a system that extends across plural clock domains, wherein said system comprises
a first functional unit that is in a first clock domain and a second functional unit that is in a second clock domain, said
method comprising:
driving said first functional unit that comprises a delay-locked loop with a first clock-signal,
driving said second functional unit with a second clock-signal that has a first temporal offset relative to said first clock-signal,
retrieving information from said delay-locked loop, and using said information retrieved from said delay-locked loop to control
a source of said second clock-signal, and

dynamically adjusting a temporal offset between said first clock-signal and said second clock-signal, wherein dynamically
adjusting said temporal offset between said first clock-signal and said second clock-signal comprises adjusting by an amount
that is insufficient to cause at least one of: (1) a glitch in said first or second clock-signal that increases a number of
pulses in said first or second clock-signal, or (2) a cycle compression in said first or second clock-signal that compresses
a margin beyond a specified value.

US Pat. No. 9,413,568

METHOD AND APPARATUS FOR CALIBRATING AN INPUT INTERFACE

Cavium, Inc., San Jose, ...

15. An input/output interface having multiple single-ended receivers, the input/output interface comprising:
an amplifier in each single ended receiver of the multiple single-ended receivers; and
a control logic,
the amplifiers and the control logic being configured to:
apply amplifier offset calibration to each of the amplifiers during a first phase, the amplifier offset calibration iteratively
determining a value for calibrating an internal voltage offset of the amplifier of each of the multiple single-ended receivers;

employ the determined value for each amplifier in the amplifier corresponding to each of the multiple single-ended receivers
during an active phase of the input/output interface;

apply reference voltage calibration to one single-ended receiver of the multiple single-ended receivers to determine a calibration
reference voltage value during a second phase; and

employ the calibration reference voltage value determined for the one single-ended receiver in each of the multiple single-ended
receivers during the active phase of the input/output interface.

US Pat. No. 9,405,702

CACHING TLB TRANSLATIONS USING A UNIFIED PAGE TABLE WALKER CACHE

Cavium, Inc., San Jose, ...

1. An apparatus comprising:
a core configured to execute memory instructions that access data stored in physical memory based on virtual addresses translated
to physical addresses based on a hierarchical page table having multiple levels that each store different intermediate results
for determining final mappings between virtual addresses and physical addresses; and

a memory management unit (MMU) coupled to the core, the MMU including a first cache that stores a plurality of the final mappings
of the page table, a page table walker that traverses the levels of the page table to provide intermediate results associated
with respective levels for determining the final mappings, and a second cache that stores a limited number of intermediate
results provided by the page table walker;

wherein the MMU is configured, in response to a request from the core to invalidate a first virtual address, to compare a
portion of the first virtual address to portions of entries in the second cache;

wherein the comparison is based on a match criterion that depends on the level associated with each intermediate result stored
in an entry in the second cache; and

wherein the MMU is configured to remove any entries in the second cache that satisfy the match criterion.

US Pat. No. 9,411,644

METHOD AND SYSTEM FOR WORK SCHEDULING IN A MULTI-CHIP SYSTEM

Cavium, Inc., San Jose, ...

1. A multi-chip system comprising:
multiple chip devices, at least one chip device of the multiple chip devices includes a work source component comprising a
core processor or a coprocessor configured to create work items; and

one or more scheduler processors each associated with a corresponding chip device of the multiple chip devices, a scheduler
processor of the one or more scheduler processors configured to assign a work item to a destination chip device of the multiple
chip devices for processing upon receiving the work item from a work source component associated with a source chip device
of the multiple chip devices.

US Pat. No. 9,811,467

METHOD AND AN APPARATUS FOR PRE-FETCHING AND PROCESSING WORK FOR PROCESSOR CORES IN A NETWORK PROCESSOR

Cavium, Inc., San Jose, ...

1. A method for pre-fetching and processing work for processor cores in a network processor, comprising:
requesting work pre-fetch by a requestor comprising one of the processor cores executing a software entity;
determining that work may be pre-fetched for processing by the requestor;
searching for work by determining whether any of a plurality of groups associated with the requestor have work;
pre-fetching the work from one of the plurality of groups into one of one or more pre-fetch work-slots associated with the
requestor when the work is found;

determining whether the pre-fetched work comprises a tag comprising metadata for the work;
comparing the tag with a tag of another work scheduled for the same requestor when the pre-fetched work comprises the tag;
processing the pre-fetched work when the comparing indicates that the pre-fetched work is atomic or ordered or when the pre-fetched
work does not comprise the tag; and

requesting another work pre-fetch when the comparing indicates that the pre-fetched work is not atomic or ordered.
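The tag-comparison decision above can be sketched as a small dispatch function. The tag semantics here are assumptions for illustration: a tag is a (value, kind) pair with kind 'atomic' or 'ordered', untagged work is processed immediately, and tagged work is processed only when the comparison against the tag already scheduled for the same requestor allows atomic or ordered execution.

```python
def prefetch_action(work_tag, scheduled_tag):
    """Return the action for a pre-fetched work item (tag semantics assumed)."""
    if work_tag is None:
        return "process"                 # untagged work runs immediately
    value, kind = work_tag
    # Processed only when the comparison shows an atomic/ordered relationship
    # and the tag does not collide with a different scheduled tag.
    if kind in ("atomic", "ordered") and (
            scheduled_tag is None or scheduled_tag[0] == value):
        return "process"
    return "request-another-prefetch"

# prefetch_action(None, ("t1", "atomic")) -> "process"
# prefetch_action(("t1", "ordered"), ("t1", "atomic")) -> "process"
```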

US Pat. No. 9,612,950

CONTROL PATH SUBSYSTEM, METHOD AND DEVICE UTILIZING MEMORY SHARING

Cavium, Inc., San Jose, ...

1. A control path subsystem comprising:
a non-transitory computer-readable control path memory logically organized into a plurality of control memory pools that each
comprise a plurality of tiles; and

control path logic communicatively coupled with the control path memory, wherein the control path logic controls the writing
of the control path packet data into and the reading of the control path packet data out of the control path memory, and further
wherein the control path logic transmits a request for a portion of a datapath memory to a control path unit of a memory allocation
element when a quantity of the control path memory that is currently storing the control path packet data reaches a threshold
value, wherein the datapath memory is a part of a datapath subsystem including datapath logic that controls the writing of
datapath packet data into and the reading of the datapath packet data out of the datapath memory, wherein if the control path
logic is notified by the control path unit of the memory allocation element that a portion of the control path memory has
been allocated to the datapath subsystem, the control path logic facilitates the writing of the datapath packet data into
and the reading of the datapath packet data out of the portion of the control path memory based on control path commands received
from the datapath logic.

US Pat. No. 9,378,033

METHOD AND APPARATUS FOR A VIRTUAL SYSTEM ON CHIP

Cavium, Inc., San Jose, ...

1. A device comprising:
a plurality of virtual systems on chip, each virtual system on chip (VSoC) relating to a subset of a plurality of processing
cores on a single physical chip, enabling embedded multi-core virtualization of the single physical chip, and a configuring
unit arranged to:

assign a unique identification tag of a plurality of identification tags to each VSoC;
assign memory subsets of a given memory to each of the plurality of virtual systems on chip;
assign each memory subset a given identification tag of the plurality of identification tags, the given identification tag
assigned to a corresponding VSoC to which the memory subset is assigned; and

provide a granularity of memory protection based on a number of the plurality of identification tags assigned;
further comprising a plurality of access control elements on the single physical chip, wherein the configuring unit is further
arranged to set each access control element to control whether a given VSoC of the plurality of virtual systems on chip is
enabled to access a given at least one location of the given memory.

US Pat. No. 9,652,505

CONTENT SEARCH PATTERN MATCHING USING DETERMINISTIC FINITE AUTOMATA (DFA) GRAPHS

Cavium, Inc., San Jose, ...

11. A method comprising:
matching, by at least one processor operatively coupled to at least one network interface, at least one pattern in an input
stream of data by traversing a data structure generated from one or more regular expressions for a content processing application,
the input stream received via the at least one network interface; and

storing the generated data structure in at least one memory operatively coupled to the at least one processor, the generated
data structure including at least one node providing information to perform at least one task based on traversing the at least
one node, enabling the at least one processor to generate and check state information at the at least one node for the matching
to obviate post-processing overhead of the at least one pattern matched, improving performance of the at least one processor
relative to performing post-processing of results for the content processing application.

US Pat. No. 9,438,561

PROCESSING OF FINITE AUTOMATA BASED ON A NODE CACHE

Cavium, Inc., San Jose, ...

10. A security appliance operatively coupled to a network, the security appliance comprising:
at least one network interface;
a plurality of memories in a memory hierarchy configured to store a plurality of nodes of at least one finite automaton for
identifying existence of at least one regular expression pattern in an input stream received via the at least one network
interface;

a node cache configured to store at least a threshold number of nodes of the at least one finite automaton; and
at least one processor operatively coupled to the at least one network interface, the plurality of memories, and the node
cache, and configured to cache a given node and one or more additional nodes, of the plurality of nodes, stored in a given
memory of the plurality of memories at a hierarchical level in the memory hierarchy, in the node cache based on a cache miss
of the given node, the one or more additional nodes cached based on a hierarchical node transaction size associated with the
hierarchical level, optimizing match performance of the at least one processor for identifying the existence of the at least
one regular expression pattern in the input stream.

US Pat. No. 9,276,846

PACKET EXTRACTION OPTIMIZATION IN A NETWORK PROCESSOR

Cavium, Inc., San Jose, ...

1. A method of processing a packet comprising:
receiving a first segment of a packet;
determining, based on the first segment, beats of a second segment of the packet containing portions of a key;
determining a start time at which the key may begin to be forwarded as a continuous stream, the start time being based on
a prediction of when the portions of the key in the second segment will be received; and

initiating forwarding the key at the start time to a processing cluster configured to operate rule matching for the packet;
and

completing forwarding the key at an end time occurring after receipt of all of the portions of the key in the second segment.

US Pat. No. 10,558,573

METHODS AND SYSTEMS FOR DISTRIBUTING MEMORY REQUESTS

Cavium, LLC, San Jose, C...

1. A computer-implemented method, comprising:
accessing a memory request comprising an address, wherein the memory request comprises a first operation associated with an instance of data;
selecting a group of caches from a plurality of groups of caches using a bit in the address;
selecting a cache in the group of caches using a first hash of the address; and
when the first operation results in a cache miss, performing a second operation to access a memory outside the cache, and otherwise processing the memory request at the cache according to the first operation.
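The two-level selection in this claim can be sketched as follows. The bit position and the multiplicative hash are illustrative assumptions (the claim only requires "a bit in the address" and "a first hash of the address"), and two groups are used because a single bit distinguishes two.

```python
GROUP_SELECT_BIT = 6  # illustrative choice of the group-selecting bit

def address_hash(address):
    # A simple multiplicative hash standing in for the claim's first hash.
    return (address * 0x9E3779B1) & 0xFFFFFFFF

def route_request(address, cache_groups):
    """One address bit selects the group of caches; a hash of the address
    selects the cache within that group."""
    group = cache_groups[(address >> GROUP_SELECT_BIT) & 1]
    return group[address_hash(address) % len(group)]

groups = [["cache0", "cache1"], ["cache2", "cache3"]]
# route_request(64, groups) lands in the second group (bit 6 set);
# route_request(0, groups) lands in the first group.
```

On a cache miss at the selected cache, the claim's second operation would then go to memory outside the cache.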

US Pat. No. 9,531,690

METHOD AND APPARATUS FOR MANAGING PROCESSING THREAD MIGRATION BETWEEN CLUSTERS WITHIN A PROCESSOR

Cavium, Inc., San Jose, ...

1. A method of managing processing thread migrations within a plurality of memory clusters, the method comprising:
embedding, in memory components of the plurality of memory clusters, instructions indicative of processing thread migrations,
wherein the instructions indicative of processing thread migrations include instructions preventing migrating a processing
thread to a memory cluster from which the processing thread migrated previously; and

processing one or more processing threads, in one or more of the plurality of memory clusters, in accordance with at least
one of the embedded migration instructions.

US Pat. No. 9,525,630

METHOD AND APPARATUS FOR ASSIGNING RESOURCES USED TO MANAGE TRANSPORT OPERATIONS BETWEEN CLUSTERS WITHIN A PROCESSOR

Cavium, Inc., San Jose, ...

1. A method comprising:
receiving information indicative of allocation to a first memory cluster of a subset of processing resources in each of one
or more other memory clusters;

storing, in the first memory cluster, the information indicative of resources allocated to the first memory cluster; and
facilitating management of transport operations between the first memory cluster and the one or more other memory clusters
based at least in part on the information indicative of resources allocated to the first memory cluster, each transport operation
comprising transfer of data, related to a corresponding processing operation, between the first memory cluster and one of
the other memory clusters, work for the corresponding processing operation at least partially executed on the first memory
cluster.

US Pat. No. 9,772,952

METHOD AND SYSTEM FOR COMPRESSING DATA FOR A TRANSLATION LOOK ASIDE BUFFER (TLB)

CAVIUM, INC., San Jose, ...

1. A method of memory management, the method comprising:
storing compressed virtual address related data in a translation look aside buffer (TLB); and
accessing physical memory using the compressed virtual address related data.

US Pat. No. 9,569,366

SYSTEM AND METHOD TO PROVIDE NON-COHERENT ACCESS TO A COHERENT MEMORY SYSTEM

Cavium, Inc., San Jose, ...

1. A system comprising:
a memory;
a memory controller providing a cache access path to the memory and a bypass-cache access path to the memory, the memory controller
receiving requests to access finite automata (FA) data at the memory on the bypass-cache access path and receiving requests
to access non-FA data at the memory on the cache access path, the finite automata (FA) data including non-deterministic finite
automata (NFA) data.

US Pat. No. 9,864,582

CODE PROCESSOR TO BUILD ORTHOGONAL EXECUTION BLOCKS FOR PROGRAMMABLE NETWORK DEVICES

Cavium, Inc., San Jose, ...

1. A processing network for receiving a source code including a plurality of IF clauses, the network comprising:
a plurality of processing elements on a programmable microchip;
a plurality of on-chip routers on the microchip for routing the data between the processing elements, wherein each of the
on-chip routers is communicatively coupled with one or more of the processing elements; and

a compiler stored on a non-transitory computer-readable memory and comprising a parser that based on one or more conditions
and one or more assignments of the source code generates a control tree by, for each one of the plurality of IF clauses within
the source code, utilizing flags to determine if the IF clause is serial to or nested within one or more preceding IF clauses
of the plurality of IF clauses within the source code.

US Pat. No. 9,507,563

SYSTEM AND METHOD TO TRAVERSE A NON-DETERMINISTIC FINITE AUTOMATA (NFA) GRAPH GENERATED FOR REGULAR EXPRESSION PATTERNS WITH ADVANCED FEATURES

Cavium, Inc., San Jose, ...

1. A method of walking a non-deterministic finite automata (NFA) graph representing a pattern, the method comprising:
by a processor, extracting a node type, a next node address, and an element from a node of the NFA graph; and
by the processor, matching a segment of a payload with the element by matching the payload with the element at least zero
times, a number of the at least zero times based on the node type, wherein extracting the node type, next node address, and
the element from the node enables the processor to identify the pattern in the payload with fewer nodes relative to another
NFA graph representing the pattern.
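The compaction the claim describes — a variable-count node standing in for what would otherwise be a chain of single-match nodes — can be illustrated with a toy walker. The node encoding, type names, and character-class elements below are hypothetical illustrations, not taken from the patent:

```python
def walk(nodes, payload):
    """Toy walk over a compacted NFA graph.

    Each node is a (type, element, next_address) tuple.  A 'fixed' node
    consumes its element exactly once; a 'variable' node consumes it zero
    or more times (greedily, for simplicity), so a single node can replace
    a chain of nodes in an uncompacted graph.
    """
    addr, pos = 0, 0
    while addr is not None:
        ntype, element, nxt = nodes[addr]
        if ntype == "match":
            return True                      # reached accept node: pattern found
        if ntype == "fixed":                 # match the element exactly once
            if pos < len(payload) and payload[pos] in element:
                pos += 1
            else:
                return False
        elif ntype == "variable":            # match the element zero or more times
            while pos < len(payload) and payload[pos] in element:
                pos += 1
        addr = nxt
    return False

# Nodes for the pattern a b* c: one variable node covers any run of b's.
nodes = [
    ("fixed", "a", 1),
    ("variable", "b", 2),
    ("fixed", "c", 3),
    ("match", None, None),
]
```

A real walker would track alternative paths rather than matching greedily; this sketch only shows how the per-node count semantics reduce node count.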

US Pat. No. 9,306,916

SYSTEM AND A METHOD FOR A REMOTE DIRECT MEMORY ACCESS OVER CONVERGED ETHERNET

CAVIUM, INC., San Jose, ...

1. A method for remote direct memory access, comprising:
generating a remote direct memory access packet comprising opaque data comprising an encrypted stream identifier and a
digest, a virtual address, and a payload at a local machine;

receiving the remote direct memory access packet at a virtual network interface card on the local machine;
reconstructing a stream identifier by separating the opaque data into the encrypted stream identifier and the digest;
decrypting the encrypted stream identifier;
verifying the decrypted stream identifier using the digest;
providing the verified stream identifier to a system memory management unit; and
mapping the virtual address and the provided stream identifier by the system memory management unit to a physical address,
into which to write the payload.

US Pat. No. 9,602,282

SECURE SOFTWARE AND HARDWARE ASSOCIATION TECHNIQUE

Cavium, Inc., San Jose, ...

1. A method for authenticating and associating a program code with an equipment, the method comprising:
associating critical security information with an original equipment manufacturer (OEM) of the equipment by encrypting the
critical security information using a unique secret value, the unique secret value identifying the OEM of the equipment associated
with the critical security information, the critical security information including a device authentication key, a chip encryption
key, and an image authentication key;
loading the critical security information associated with the OEM of the equipment from a memory at an initial startup time;
retrieving the chip encryption key and the image authentication key stored in the critical security information associated
with the OEM of the equipment in the memory by decrypting the critical security information using the unique secret value;

authenticating the program code using the chip encryption key and the image authentication key; and
transferring ownership of the equipment to a new owner by updating the device authentication key of the critical security
information with at least one public key of the new owner.

US Pat. No. 9,553,819

SYSTEMS AND METHODS FOR TIMING ADJUSTMENT OF METADATA PATHS IN A NETWORK SWITCH UNDER TIMING CONSTRAINTS

CAVIUM, INC., San Jose, ...

1. A computer-implemented system including a processor executing instructions stored in a storage medium to support automatic
timing adjustment of metadata paths in a network switch, comprising:
a path identification engine running on a host and configured to identify a plurality of metadata paths in the network switch,
wherein each of the metadata paths carries a piece of metadata of the incoming packet from one component in the network switch
to another component in the network switch;

a constraint generation engine running on a host and configured to generate a plurality of timing constraints for each of
the metadata paths in the network switch, wherein the timing constraints need to be met in order for the network switch to
function properly;

a path timing optimization engine running on a host and configured to:
calculate current path delays of each of the identified metadata paths, wherein the path delay is the time to carry the piece
of metadata from one component to another via the metadata path;

determine optimal timing values of each of the metadata paths to meet the timing constraints;
compare the optimal timing values of the metadata paths to the current path delays of the paths to identify one or more
metadata paths whose current delay values do not meet the timing constraints;

adjust the delays of the identified one or more metadata paths in the network switch to meet the timing constraints at minimum
cost.

US Pat. No. 9,529,773

SYSTEMS AND METHODS FOR ENABLING ACCESS TO EXTENSIBLE REMOTE STORAGE OVER A NETWORK AS LOCAL STORAGE VIA A LOGICAL STORAGE CONTROLLER

CAVIUM, INC., San Jose, ...

1. A system to support elastic network storage, comprising:
a logical storage controller of a local network interface card/controller (NIC), configured to:
accept a request for storage space from one of a plurality of virtual machines (VMs) running on a host;
allocate storage volumes on one or more remote storage devices accessible over a network fabric in accordance with the request
for the storage space;

create and map one or more logical volumes in one or more logical namespaces to the storage volumes on the remote storage
devices;

present the logical volumes mapped to the storage volumes on the remote storage devices as local storage volumes to the VM
requesting the storage space;

enable the VM to perform a read/write operation on the logical volumes via a first instruction.

US Pat. No. 9,513,926

FLOATING MASK GENERATION FOR NETWORK PACKET FLOW

Cavium, Inc., San Jose, ...

1. A tag mask generation method comprising:
receiving a section_selector flag indicating whether a tag mask comprising an arbitrary number of bits for at least one section
of a network packet is to be generated;

receiving from a parser parse information for the network packet, wherein the parse information includes a section_pointer
that indicates a location of the section in the network packet;

generating a pointer based on the section_pointer when the section_selector indicates that the tag mask for the section is
to be generated;

receiving a base mask for the section; and
generating the tag mask via a shifter by shifting the base mask by the amount indicated by the pointer.
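The claimed flow — gate on the selector flag, take the parser's section pointer, and shift the base mask by that amount — is simple enough to sketch directly (function and parameter names are assumptions, not from the claim):

```python
def generate_tag_mask(section_selector, section_pointer, base_mask):
    """Sketch of floating tag-mask generation.

    If the section_selector flag is clear, no mask is generated for this
    section.  Otherwise the base mask is shifted by the offset the parser
    reported for the section, placing the mask over the section's bits.
    """
    if not section_selector:          # no tag mask requested for this section
        return None
    pointer = section_pointer         # pointer derived from the parse information
    return base_mask << pointer       # the shifter applies the pointer as shift amount

# Example: a 4-bit base mask floated to bit offset 4.
mask = generate_tag_mask(True, 4, 0b1111)   # 0b11110000
```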

US Pat. No. 9,501,425

TRANSLATION LOOKASIDE BUFFER MANAGEMENT

Cavium, Inc., San Jose, ...

1. A method for managing a plurality of translation lookaside buffers, each translation lookaside buffer being associated
with a corresponding processing element of a plurality of processing elements, the method comprising:
issuing a first translation lookaside buffer invalidation instruction at a first processing element of the plurality of processing
elements, and sending the first translation lookaside buffer invalidation instruction to a second processing element of the
plurality of processing elements;

receiving translation lookaside buffer invalidation instructions, including the first translation lookaside buffer invalidation
instruction, at the second processing element;

issuing an element-specific synchronization instruction at the first processing element, and broadcasting a synchronization
command to multiple processing elements, the element-specific synchronization instruction being issued without being broadcast
to multiple processing elements and the element-specific synchronization instruction preventing issuance of additional translation
lookaside buffer invalidation instructions at the first processing element until an acknowledgement in response to the synchronization
command is received at the first processing element;

receiving the synchronization command at the second processing element; and
after completion of any translation lookaside buffer invalidation instructions, including the first translation lookaside
buffer invalidation instruction, issued at the second processing element before the synchronization command was received at
the second processing element, sending the acknowledgement from the second processing element to the first processing element,
the acknowledgement indicating that any translation lookaside buffer invalidation instructions, including the first translation
lookaside buffer invalidation instruction, issued at the second processing element before the synchronization command was
received at the second processing element are complete.

US Pat. No. 9,507,369

DYNAMICALLY ADJUSTING SUPPLY VOLTAGE BASED ON MONITORED CHIP TEMPERATURE

Cavium, Inc., San Jose, ...

5. Apparatus comprising:
a temperature sensor for monitoring a temperature of a semiconductor chip;
a controller configured to adjust a supply voltage to the semiconductor chip by increasing the supply voltage as a continuous
function of the monitored temperature decreasing, wherein the controller is configured to adjust the supply voltage only if
the monitored temperature is below a threshold temperature and the supply voltage adjusted is determined based on a linear
relationship having a negative slope between the supply voltage and the monitored temperature as defined by the continuous
function.
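The claimed relationship — adjust only below a threshold temperature, raising the supply voltage linearly as the monitored temperature falls (a negative voltage-versus-temperature slope) — can be written as a small function. All numeric constants here are illustrative assumptions, not values from the patent:

```python
def adjusted_supply_voltage(temp_c,
                            v_nominal=0.90,       # nominal supply, volts (hypothetical)
                            t_threshold=40.0,     # adjust only below this temperature, C
                            slope_v_per_c=-0.002):  # negative slope: V rises as T falls
    """Continuous linear supply-voltage adjustment versus chip temperature.

    At or above the threshold the nominal supply is kept; below it the
    supply increases linearly as the monitored temperature decreases.
    """
    if temp_c >= t_threshold:
        return v_nominal
    return v_nominal + slope_v_per_c * (temp_c - t_threshold)

# 10 C below the threshold raises the supply by 20 mV in this sketch.
v = adjusted_supply_voltage(30.0)   # ~0.92 V
```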

US Pat. No. 9,432,288

SYSTEM ON CHIP LINK LAYER PROTOCOL

Cavium, Inc., San Jose, ...

1. A method comprising:
generating a data message at a first system-on-chip (SOC) for transmission to a second SOC, the first and second SOCs each
including a cache and a plurality of processing cores;

associating the data message with one of a plurality of virtual channels;
generating a data block to include data associated with each of the plurality of virtual channels, the data block including
at least a portion of the data message;

distributing segments of the data block across a plurality of output ports at the first SOC; and
transmitting the data block to the second SOC via the plurality of output ports.

US Pat. No. 9,306,584

MULTI-FUNCTION DELAY LOCKED LOOP

Cavium, Inc., San Jose, ...

8. An interface circuit comprising:
a plurality of blocks, each block receiving a clock signal and a respective outbound data signal, each block comprising:
a delay line configured to receive the clock signal and output a delayed clock signal;
a delay controller configured to control the delay line to output the delayed clock signal at a quadrature delay relative
to the clock signal;

a multiplexer receiving a plurality of delay signals, the delay signals including the clock signal and the delayed clock signals;
at least one flip-flop configured to receive the respective outbound data signal, the at least one flip-flop being clocked
by the multiplexer output; and

a state machine configured to control the multiplexer at each of the blocks to select a delay signal to provide signal leveling
among the respective outbound data signals.

US Pat. No. 9,639,476

MERGED TLB STRUCTURE FOR MULTIPLE SEQUENTIAL ADDRESS TRANSLATIONS

CAVIUM, INC., San Jose, ...

1. A circuit comprising:
a cache configured to store translations between address domains, the cache addressable as a first logical portion and a second
logical portion, the first logical portion configured to store translations between a first address domain and a second address
domain, the second logical portion configured to store translations between the second address domain and a third address
domain;

a processor configured to match an address request against the cache and output a corresponding address result; and
a register configured to define a boundary between the first and second logical portions, the boundary indicating a location
within the cache, the location being defined by a value stored at the register.

US Pat. No. 9,612,934

NETWORK PROCESSOR WITH DISTRIBUTED TRACE BUFFERS

Cavium, Inc., San Jose, ...

1. A system comprising:
a cache; and
a plurality of processor subsets configured to access the cache, each processor subset comprising:
a group of processors;
a bus, the groups connected to the cache via the respective bus, the bus carrying commands from the group of processors to
the cache, the bus further carrying data between the cache and the processors; and

a trace buffer connected to the bus between the group of processors and the cache, the trace buffer configured to store information
regarding commands to access the cache sent by the group of processors along the bus to the cache, the information including
address information, command information and command time information;

the trace buffers at each of the processor subsets sharing a common address space to enable access to the trace buffers as
a single entity.

US Pat. No. 9,614,762

WORK MIGRATION IN A PROCESSOR

Cavium, Inc., San Jose, ...

1. An apparatus for processing a packet comprising:
a plurality of clusters, each cluster including a plurality of processors for processing lookup requests and a local memory
storing a set of rules;

at least one of the plurality of clusters being configured to:
generate a work product associated with one of the lookup requests, the work product corresponding to a process of rule-matching
at least one field of a packet associated with the lookup request;

determine whether to forward the work product to another of the plurality of clusters; and
based on the determination, forward the work product to another of the plurality of clusters.

US Pat. No. 9,568,944

DISTRIBUTED TIMER SUBSYSTEM ACROSS MULTIPLE DEVICES

CAVIUM, INC., San Jose, ...

1. An apparatus comprising:
a first silicon device configured in accordance with an Advanced RISC Machines™ (ARM) architecture, the first silicon device
having a timer and at least one processing element;

a second silicon device configured in accordance with the ARM architecture, the second silicon device having a timer and at
least one processing element;

an interconnect coupled between the first silicon device and the second silicon device such that both the first and second
silicon devices have access to any of the processing elements disposed on both the first and second silicon devices; and

at least one of the first and second silicon devices configured to determine an offset between the timer of the first silicon
device and the timer of the second silicon device in accordance with a total delay of a first synchronization message transmitted
from the first silicon device to the second silicon device and a total delay of a second time synchronization message transmitted
from the second silicon device to the first silicon device.
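Determining the offset from the two total delays, as the claim describes, resembles the standard symmetric-link offset estimate used in time-synchronization protocols. The timestamp naming below is an assumption for illustration:

```python
def timer_offset(t1, t2, t3, t4):
    """Estimate the offset of device B's timer relative to device A's.

    t1: send time of the first sync message on device A (A's timer)
    t2: its receive time on device B (B's timer)
    t3: send time of the second sync message on device B (B's timer)
    t4: its receive time on device A (A's timer)

    Assuming equal link delay in both directions, the offset is half the
    difference between the two measured total delays.
    """
    return ((t2 - t1) - (t4 - t3)) / 2.0

# B's timer 5 units ahead of A's, with a 2-unit link delay each way:
off = timer_offset(0.0, 7.0, 8.0, 5.0)   # 5.0
```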

US Pat. No. 9,542,342

SMART HOLDING REGISTERS TO ENABLE MULTIPLE REGISTER ACCESSES

Cavium, Inc., San Jose, ...

1. A processor comprising:
target registers;
N holding registers, wherein each of the N holding registers is associated with a source and is configured to refrain from
pushing any subsets of an update received from the source to one of the target registers until all of the subsets of the update
have been received from the source; and

a bus coupling the target registers and the N holding registers, wherein when the bus is accessed by one of the holding registers,
the bus includes a source identifier indicating the one of the N holding registers that the access is from.

US Pat. No. 9,529,640

WORK REQUEST PROCESSOR

Cavium, Inc., San Jose, ...

1. A scheduling processor for scheduling work for a plurality of processors, the scheduling processor comprising:
an add work engine (AWE) configured to forward a work queue entry (WQE) to one of a plurality of input queues (IQs);
an on-deck unit (ODU) comprising a table having a plurality of entries, each entry storing a respective WQE; and a plurality
of lists, each of the lists being associated with a respective processor configured to execute WQEs and comprising a plurality
of pointers to entries in the table, each of the lists adding a pointer based on an indication of whether the associated processor
accepts the WQE corresponding to the pointer; and

a get work engine (GWE) configured to move WQEs from the plurality of IQs to the table of the ODU.

US Pat. No. 9,516,145

METHOD OF EXTRACTING DATA FROM PACKETS AND AN APPARATUS THEREOF

Cavium, Inc., San Jose, ...

1. A method of implementing a parser engine, the method comprising:
identifying one or more protocol layers of a packet, wherein each of the protocol layers has one or more fields;
for each protocol layer of the protocol layers, expanding the protocol layer to a generic format having a predetermined number
of fields based on the identification of the protocol layer thereby forming an expanded protocol layer; and

selecting contents from each of the expanded protocol layers to thereby form a final token.

US Pat. No. 9,485,179

APPARATUS AND METHOD FOR SCALABLE AND FLEXIBLE TABLE SEARCH IN A NETWORK SWITCH

Cavium, Inc., San Jose, ...

1. A network switch to support scalable and flexible table search, comprising:
a packet processing pipeline including a plurality of packet processing clusters configured to process a received packet through
multiple packet processing stages based on table search/lookup results;

a plurality of search logic units each corresponding to one of the plurality of packet processing clusters, wherein each of the
search logic units is configured to:

convert a unified table search request from its corresponding packet processing cluster to a plurality of table search commands
specific to one or more memory clusters that maintain a plurality of tables;

provide the plurality of table search commands specific to the memory clusters in parallel;
collect and provide the table search results from the memory clusters to the corresponding packet processing cluster;
said one or more memory clusters configured to:
maintain the tables to be searched;
search the tables in parallel according to the plurality of table search commands from the search logic unit;
process and provide the table search results to the search logic unit;
wherein each of the memory clusters includes one or more of static random-access memory (SRAM) pools and ternary content-addressable
memory (TCAM) pools, wherein each of the SRAM pools includes a plurality of memory pairs each having a hash function circuitry,
two memory tiles, and a data processing circuitry and each of the TCAM pools includes a plurality of TCAM databases each having
a plurality of TCAM tiles and SRAM tiles.

US Pat. No. 9,470,719

TESTING SEMICONDUCTOR DEVICES

Cavium, Inc., San Jose, ...

1. An apparatus, comprising:
a plurality of semiconductor devices;
an electrical input device for applying voltage to said plurality of semiconductor devices;
a switching array configured to sequentially interconnect said electrical input device to each of said plurality of semiconductor
devices and disconnect the other semiconductor devices from said electrical input device, a semiconductor device connected
to the electrical input device being a device under test that produces a test current and the other semiconductor devices
being devices not under test that, in the aggregate, produce a leakage current;

an output node interconnected to the switching array for enabling the measurement of the test current at the output node;
and

a leakage current compensator connected to the output node and the switching array, the leakage current compensator configured
to divert the leakage current away from the output node.

US Pat. No. 9,137,340

INCREMENTAL UPDATE

Cavium, Inc., San Jose, ...

1. A method comprising:
receiving an incremental update for a Rule Compiled Data Structure (RCDS), the RCDS representing a set of rules for packet
classification, the RCDS utilized for packet classification by an active search process;

maintaining a housekeeping tree, an augmented representation of the RCDS including additional information of the RCDS for
determining updates for the RCDS;

using the housekeeping tree to create a change list; and
atomically updating the RCDS based on the incremental update received, the change list created for atomically updating the
RCDS from the perspective of the active search process utilizing the RCDS.

US Pat. No. 9,813,342

METHOD AND SYSTEM FOR IMPROVED LOAD BALANCING OF RECEIVED NETWORK TRAFFIC

Cavium, Inc., San Jose, ...

1. A method for load balancing of a received packet based network traffic, comprising:
receiving a packet at a network interface;
determining a physical port identifier in accordance with a physical port receiving the packet;
providing the packet and the determined physical port identifier to a software defined network switch;
determining information pertaining to uniqueness of a packet flow for the received packet by parsing at least one layer of
the packet in accordance with rules of the software defined network switch and determining a tag from the at least one parsed
layer in accordance with the rules;

determining that the received packet comprises a non-standard packet structure upon the physical port identifier identifying
one of a first set of physical ports;

providing the tag together with the received packet to a network interface controller when the received packet comprises the
non-standard packet structure;

providing at least the received packet to the network interface controller when the received packet comprises a standard packet
structure upon the physical port identifier identifying one of a second set of physical ports; and

processing the received packet at the network interface controller in accordance with at least one of the provided tag and
the packet structure; wherein the received packet comprising the standard packet structure is processed in accordance with
a receive side scaling.

US Pat. No. 9,531,848

METHOD OF USING GENERIC MODIFICATION INSTRUCTIONS TO ENABLE FLEXIBLE MODIFICATIONS OF PACKETS AND AN APPARATUS THEREOF

Cavium, Inc., San Jose, ...

9. A method of a network switch, the method comprising:
maintaining a set of generic commands in a memory of the network switch;
receiving a packet at an incoming port of the network switch;
generalizing each protocol header of the packet according to a generic format for the protocol header, wherein each generalized
protocol header includes a bit vector with bits marked as a first value for invalid fields and bits marked as a second value
for valid fields;

modifying at least one of the generalized protocol headers by applying at least one command from the set of generic commands
to the generalized protocol header, thereby updating the bit vector;

forming a new protocol header based on the updated bit vector; and
transmitting the packet with the new protocol header via an outgoing port of the network switch.
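The generalize/modify/re-form flow above can be sketched in Python. The field layout, the "delete field" command, and the choice of 0 for invalid and 1 for valid bits are illustrative assumptions, not taken from the patent:

```python
def generalize(fields, layout):
    """Expand a protocol header into a fixed-layout field list plus a bit
    vector, with 1 marking a valid field and 0 an invalid (absent) one."""
    values = [fields.get(name) for name in layout]
    bits = [1 if v is not None else 0 for v in values]
    return values, bits

def apply_delete(values, bits, index):
    """A generic 'delete field' command: clear the field and its valid bit."""
    values[index] = None
    bits[index] = 0

def form_header(values, bits, layout):
    """Form the new protocol header from fields whose valid bit is set."""
    return {name: v for name, v, b in zip(layout, values, bits) if b}

# A header with no VLAN field, generalized to a three-field layout:
layout = ["dst", "src", "vlan"]
values, bits = generalize({"dst": 1, "src": 2}, layout)   # bits == [1, 1, 0]
apply_delete(values, bits, 0)                              # generic command updates the bit vector
header = form_header(values, bits, layout)                 # {"src": 2}
```

Because every header is first expanded to the same generic layout, one small command set can modify any protocol without per-protocol logic.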

US Pat. No. 9,961,167

METHOD OF MODIFYING PACKETS TO A GENERIC FORMAT FOR ENABLING PROGRAMMABLE MODIFICATIONS AND AN APPARATUS THEREOF

Cavium, Inc., San Jose, ...

1. A method for protocol layer expansion using a rewrite engine of a network switch, the method comprising:
receiving a protocol layer of a header of an incoming packet with the network switch, the protocol layer formatted according to a header layer protocol having one or more supported fields;
detecting any of the one or more supported fields that are missing from the protocol layer with the network switch; and
based on the any of the one or more supported fields that are detected as being missing, expanding the protocol layer to a generic format of the header layer protocol with the network switch such that the expanded protocol layer of the header includes the any of the one or more supported fields that are missing filled with generic data, wherein the generic data is independent of the incoming packet.

US Pat. No. 9,602,532

METHOD AND APPARATUS FOR OPTIMIZING FINITE AUTOMATA PROCESSING

Cavium, Inc., San Jose, ...

1. A security appliance operatively coupled to a network, the security appliance comprising:
at least one memory configured to store at least one finite automaton including a plurality of nodes generated from at least
one regular expression pattern;

at least one processor operatively coupled to the at least one memory and configured to walk the at least one finite automaton,
with segments of an input stream received via the network, to match the at least one regular expression pattern in the input
stream, the walk including iteratively walking at least two nodes of a given finite automaton, of the at least one finite
automaton, in parallel, with a segment, at a current offset within a payload, of a packet in the input stream, based on positively
matching the segment at a given node of the at least two nodes walked in parallel, the current offset being updated to a next
offset per iteration.

US Pat. No. 9,531,647

MULTI-HOST PROCESSING

Cavium, Inc., San Jose, ...

1. An apparatus for processing a packet comprising:
a packet processor configured to operate rule matching for packets received from a plurality of hosts; and
an input processor comprising:
a queue register configured to receive lookup requests from the plurality of hosts, each lookup request corresponding to a
packet;

a payload header extractor (PHE) configured to 1) extract a host identifier from each lookup request and store the host identifier
extracted from each lookup request, the host identifier indicating the one of the plurality of hosts originating the lookup
request, the lookup request indicating a request to determine a path to forward the packet on a network, and 2) generate at
least one key request for each lookup request, the at least one key request including a key for comparing against a set of
rules to determine the path to forward the packet on the network;

a request counter configured to maintain a request count of the number of lookup requests per host in the queue register based
on the host identifier of each lookup request;

a queue manager configured to compare the request count for each host against a respective input threshold, the queue manager
preventing receipt of additional lookup requests from a given host to the queue register in response to the request count
for the given host exceeding the respective input threshold; and

a scheduler output manager configured to forward the at least one key request for each of the lookup requests from the queue
register to the packet processor.

US Pat. No. 9,529,532

METHOD AND APPARATUS FOR MEMORY ALLOCATION IN A MULTI-NODE SYSTEM

Cavium, Inc., San Jose, ...

1. A multi-chip system comprising:
multiple chip devices, a first chip device of the multiple chip devices includes a memory allocator (MA) hardware component;
and

one or more free-pool allocator (FPA) coprocessors, each associated with a corresponding chip device, and each configured
to manage a corresponding list of pools of free-buffer pointers,

the MA hardware component configured to:
allocate a free buffer, associated with a chip device of the multiple chip devices, to data associated with a work item based
on the one or more lists of free-buffer pointers managed by the one or more FPA coprocessors.

US Pat. No. 9,531,849

METHOD OF SPLITTING A PACKET INTO INDIVIDUAL LAYERS FOR MODIFICATION AND INTELLIGENTLY STITCHING LAYERS BACK TOGETHER AFTER MODIFICATION AND AN APPARATUS THEREOF

Cavium, Inc., San Jose, ...

1. A method of a rewrite engine, the method comprising:
maintaining a pointer structure for a packet, the packet having a header including a plurality of protocol layers and a body,
wherein the body of the header is a portion of the header that is not modified by the rewrite engine, and further wherein
the pointer structure includes an end pointer, a plurality of layer pointers and a total size of the header of the packet,
wherein each of the layer pointers points to a start position of a different one of the protocol layers of the header of the
packet and the end pointer points to the start of the body of the header;

splitting the layers of the packet based on the layer pointers for layer modifications;
updating the layer pointers based on layer modifications;
stitching back together the layers based on the updated layer pointers thereby forming a new protocol layer stack; and
transmitting the packet with the new protocol layer stack to an output port.
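A minimal sketch of the pointer structure and the split/stitch steps, assuming the layer pointers are byte offsets into the header (class and function names are hypothetical):

```python
class HeaderPointers:
    """Pointer structure for a packet header: per-layer start offsets plus
    an end pointer marking the start of the unmodified body."""
    def __init__(self, layer_ptrs, end_ptr):
        self.layer_ptrs = layer_ptrs   # start offset of each protocol layer
        self.end_ptr = end_ptr         # start of the body, which is never modified

def split_layers(header, ptrs):
    """Split the header into individual layers using the layer pointers."""
    bounds = ptrs.layer_ptrs + [ptrs.end_ptr]
    return [header[bounds[i]:bounds[i + 1]] for i in range(len(ptrs.layer_ptrs))]

def stitch_layers(layers, body):
    """Stitch the (possibly resized) layers back together ahead of the
    untouched body, forming the new protocol layer stack."""
    return b"".join(layers) + body

# Two layers (2 and 3 bytes) followed by a 2-byte body:
ptrs = HeaderPointers([0, 2], 5)
layers = split_layers(b"AABBBCC", ptrs)      # [b"AA", b"BBB"]
new = stitch_layers([b"AA", b"BB"], b"CC")   # second layer shrunk by one byte
```

Keeping per-layer pointers means a layer can grow or shrink during modification without the engine having to re-parse the other layers; the stitch step re-establishes the offsets implicitly by concatenation order.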

US Pat. No. 9,431,105

METHOD AND APPARATUS FOR MEMORY ACCESS MANAGEMENT

Cavium, Inc., San Jose, ...

1. A method comprising:
receiving requests for access to a memory from one or more devices, each particular request associated with one of a plurality
of virtual channels;

for each one of the plurality of virtual channels, maintaining a corresponding linked list;
for each particular request received, assigning a tag and adding the assigned tag to the linked list corresponding to the
virtual channel associated with the particular request received;

transmitting each request received with the assigned tag to the memory;
receiving responses to the requests from the memory, each response having an associated tag; and
transmitting the responses received to the one or more devices including comparing the tags of the responses received with
a top of lists state indicating which tags are at the top of the corresponding linked lists and transmitting those responses
received for which the comparison indicates a match.
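The tag-and-linked-list ordering can be sketched with one FIFO per virtual channel (a `deque` standing in for the hardware linked list; all names are hypothetical). A response is forwarded only when its tag matches the top of its channel's list, so responses are delivered in per-channel request order even if the memory returns them out of order:

```python
from collections import deque

class VcOrderer:
    """Per-virtual-channel ordering of memory responses by tag."""
    def __init__(self):
        self.lists = {}       # virtual channel -> deque of outstanding tags
        self.next_tag = 0
        self.pending = {}     # tag -> held response not yet at top of its list

    def add_request(self, vc):
        """Assign a tag to a new request and append it to the channel's list."""
        tag = self.next_tag
        self.next_tag += 1
        self.lists.setdefault(vc, deque()).append(tag)
        return tag

    def on_response(self, vc, tag, response):
        """Accept a response from memory; return every response that can now
        be transmitted, i.e. whose tag has reached the top of the list."""
        self.pending[tag] = response
        out, q = [], self.lists[vc]
        while q and q[0] in self.pending:   # compare against top-of-list state
            out.append(self.pending.pop(q.popleft()))
        return out
```

For example, if the memory answers the second request on a channel first, that response is held until the first request's response arrives, and then both go out in order.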

US Pat. No. 9,813,327

HIERARCHICAL HARDWARE LINKED LIST APPROACH FOR MULTICAST REPLICATION ENGINE IN A NETWORK ASIC

Cavium, Inc., San Jose, ...

1. A network switching device comprising:
a memory;
a replication table stored in the memory and including a multicast rule that is represented in a hierarchical linked list
with N tiers, wherein each node in the hierarchical linked list is stored as an entry in the replication table and at least
one of the entries comprises one or more rule values indicating:

whether a copy of a packet is made; and
how to modify the copy relative to an original; and
a multicast replication engine that replicates a packet according to the multicast rule.

US Pat. No. 9,502,099

MANAGING SKEW IN DATA SIGNALS WITH MULTIPLE MODES

Cavium, Inc., San Jose, ...

1. An apparatus for controlling a memory, said apparatus comprising: a memory controller, and an interface to data lines connecting
said memory controller to said memory, wherein each of said data lines carries a signal that corresponds to a bit of a byte
that is to be written to said memory, wherein said interface comprises, for each of said data lines, a transmitter to drive
bits to be written into said memory onto a data line coupled to the transmitter, wherein said interface comprises, for each
of said data lines, a receiver to detect bits that have been read from said memory from a data line coupled to the receiver,
wherein said memory controller comprises, for each of said data lines, a data de-skewer used in a writing mode and a reading
mode, wherein for each of said data lines, said data de-skewer is configured to receive a first data signal, wherein each
of said data lines is associated with an inherent skew, wherein said data de-skewer applies a compensation skew to said first
data signal to generate a second data signal, wherein, in the writing mode said second data signal is a signal that represents
a bit that is to be written into said memory, and in the reading mode said second data signal is a signal that represents
a bit that has been read from said memory after being de-skewed, and wherein, for at least a first data line of said data
lines, said data de-skewer is configured to apply a particular compensation skew using a particular delay line coupled to
the first data line to write data in the writing mode, after that particular compensation skew has been obtained by training
said data de-skewer in the reading mode using the same particular delay line coupled to the first data line.

US Pat. No. 9,496,012

METHOD AND APPARATUS FOR REFERENCE VOLTAGE CALIBRATION IN A SINGLE-ENDED RECEIVER

Cavium, Inc., San Jose, ...

1. A method of calibrating a reference voltage of a single-ended receiver, the method comprising:
applying a clock signal and a reference voltage signal as inputs to a differential amplifier of the single-ended receiver;
evaluating an indication of a duty cycle associated with an output signal of the differential amplifier;
adjusting, by control logic configured to apply a binary search, a level of the reference voltage signal based on the evaluated
indication of the duty cycle; and

repeating said evaluating and said adjusting over a number of iterations.
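
The evaluate-and-adjust loop of this claim is a binary search over reference-voltage codes driven by a duty-cycle indication. The following is a minimal sketch under assumed interfaces: `duty_cycle_of` stands in for measuring the differential amplifier's output, and the code range and iteration count are illustrative.

```python
def calibrate_vref(duty_cycle_of, v_min=0, v_max=255, iterations=8):
    """Binary-search a reference-voltage code until the sampled clock's
    duty cycle settles near 50% (hypothetical interface)."""
    lo, hi = v_min, v_max
    for _ in range(iterations):
        vref = (lo + hi) // 2
        duty = duty_cycle_of(vref)  # evaluate the duty-cycle indication
        if duty > 0.5:
            lo = vref + 1  # reference too low: output stays high too long
        else:
            hi = vref - 1  # reference too high: output stays low too long
    return (lo + hi) // 2
```

Because each iteration halves the remaining code range, eight iterations resolve an 8-bit reference-voltage code.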

US Pat. No. 9,426,165

METHOD AND APPARATUS FOR COMPILATION OF FINITE AUTOMATA

Cavium, Inc., San Jose, ...

1. A security appliance operatively coupled to a network, the security appliance comprising:
at least one memory and at least one network interface;
at least one processor operatively coupled to the at least one memory and the at least one network interface, the at least
one processor configured to:

select a subpattern from each pattern in a set of one or more regular expression patterns based on at least one heuristic;
generate a unified deterministic finite automata (DFA) using the subpatterns selected from all patterns in the set;
generate at least one non-deterministic finite automata (NFA) for at least one pattern in the set, a portion of the at least
one pattern used for generating the at least one NFA, and at least one walk direction selected from a reverse and forward
walk direction for run time processing of the at least one NFA, being determined based on whether a length of the subpattern
selected from the at least one pattern is fixed or variable and a location of the subpattern selected within the at least
one pattern; and

store the unified DFA and the at least one NFA generated in the at least one memory for run time processing by the at least
one processor with a payload received via the at least one network interface, to determine pattern matches in the payload
prior to forwarding the payload, the subpatterns selected based on the at least one heuristic to minimize a number of false
positives identified in the at least one NFA to reduce the run time processing of the at least one processor.

US Pat. No. 9,355,206

SYSTEM AND METHOD FOR AUTOMATED FUNCTIONAL COVERAGE GENERATION AND MANAGEMENT FOR IC DESIGN PROTOCOLS

CAVIUM, INC., San Jose, ...

1. A system to support automatic functional coverage generation and management for an integrated circuit (IC) design protocol,
comprising:
a specification generation engine running on a host, which in operation, is configured to automatically generate one or more
specifications for functional coverage of the IC design protocol based on inputs from an architect of the IC design protocol;

a coverage validation engine running on a host, which in operation, is configured to validate the one or more specifications
for the functional coverage and generate coverage data on reachable states at formal verification (FV) level;

a coverage data collection engine running on a host, which in operation, is configured to conduct simulation of the IC design
protocol and collect coverage data of reached coverage points at register-transfer level (RTL); and

a coverage data analysis engine running on a host, which in operation, is configured to analyze and verify completeness of
the functional coverage of the IC design protocol based on the coverage data collected at the formal verification level and
at the RTL, respectively;

wherein the IC design protocol is a directory-based cache coherence protocol, which connects a plurality of System-on-Chips
(SOCs) as a single multicore processor via connections among the SOCs.

US Pat. No. 9,059,945

WORK REQUEST PROCESSOR

Cavium, Inc., San Jose, ...

1. A system including a scheduling processor configured to schedule work for a plurality of processors, the scheduling processor
comprising:
an add work engine (AWE) configured to forward a work queue entry (WQE) to one of a plurality of input queues (IQs);
an on-deck unit (ODU) comprising a memory storing a table and a plurality of lists, the table having a plurality of entries,
each entry storing a respective WQE; each of the plurality of lists being associated with a respective one of the plurality
of processors configured to execute WQEs and comprising a plurality of pointers to entries in the table, each of the lists
adding a pointer based on an indication of whether the associated processor accepts the WQE corresponding to the pointer;
and

a get work engine (GWE) configured to move WQEs from the plurality of IQs to the table of the ODU;
wherein the indication is based on one or more of: a work group corresponding to the WQE, a comparison of a priority of the
WQE against a priority of other WQEs stored at the list, and an identifier of the IQ storing the WQE.

US Pat. No. 9,864,583

ALGORITHM TO DERIVE LOGIC EXPRESSION TO SELECT EXECUTION BLOCKS FOR PROGRAMMABLE NETWORK DEVICES

Cavium, Inc., San Jose, ...

1. A processing network for receiving a source code having one or more code paths that are each associated with one or more
conditions and one or more assignments of the source code, the network comprising:
a plurality of processing elements on a programmable microchip, wherein each of the processing elements have one or more instruction
tables each including one or more blocks, wherein each of the code paths is associated with a block address for each of the
blocks;

a plurality of on-chip routers on the microchip for routing the data between the processing elements, wherein each of the
on-chip routers is communicatively coupled with one or more of the processing elements; and

a compiler stored on a non-transitory computer-readable memory and comprising a logic generator that generates a bit expression
for each bit of the block address for each of the blocks, wherein the bit expression is a logical combination of the conditions
associated with selected code paths of the code paths whose block addresses for the block have a value indicating use of the
bit associated with the bit expression.

US Pat. No. 9,729,447

APPARATUS AND METHOD FOR PROCESSING ALTERNATELY CONFIGURED LONGEST PREFIX MATCH TABLES

Cavium, Inc., San Jose, ...

1. A network switch, comprising:
a memory configurable to store alternate table representations of an individual trie in a hierarchy of tries, wherein the
alternate table representations include a sparse mode representation that identifies selected trie nodes, a bit map mode representation
with a bit map that identifies selected trie nodes, and a leaf-push representation that identifies selected trie nodes at
the bottom of a trie;

a hardware prefix table processor to
access in parallel, using an input network address, the alternate table representations of the individual trie and search
for a longest prefix match in each alternate table representation to obtain local prefix matches, and

select the longest prefix match from the local prefix matches, wherein the longest prefix match has an associated next hop
index base address and offset value.
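
The final arbitration step of this claim, picking the overall winner from the per-representation local matches, can be sketched as follows. This is a simplified software model: each alternate table representation is assumed to expose a lookup returning `(prefix_length, next_hop)` or `None`, which is not an interface stated in the claim.

```python
def longest_prefix_match(addr, representations):
    """Search all table representations (in hardware, in parallel) and
    select the longest of the local prefix matches."""
    best = None
    for lookup in representations:
        local = lookup(addr)  # (prefix_length, next_hop) or None
        if local and (best is None or local[0] > best[0]):
            best = local
    return best
```

The hardware performs the per-representation searches concurrently; only the compare-and-select at the end is inherently sequential.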

US Pat. No. 9,690,590

FLEXIBLE INSTRUCTION EXECUTION IN A PROCESSOR PIPELINE

CAVIUM, INC., San Jose, ...

1. A method for executing instructions in a processor, the method comprising:
selecting one or more instructions to be issued together in the same clock cycle of the processor from among a plurality of instructions,
the selected one or more instructions occurring consecutively according to a program order; and

executing instructions that have been issued, through multiple execution stages of a pipeline of the processor, the executing
including:

determining a delay assigned to a first instruction, and
sending a result of a first operation performed by the first instruction in a first execution stage to a second execution
stage, where the number of execution stages between the first execution stage and the second execution stage is based on the
determined delay.

US Pat. No. 9,684,606

TRANSLATION LOOKASIDE BUFFER INVALIDATION SUPPRESSION

CAVIUM, INC., San Jose, ...

1. A method for managing a plurality of translation lookaside buffers, each translation lookaside buffer including a plurality
of translation lookaside buffer entries and being associated with a corresponding processing element of a plurality of processing
elements, the method comprising:
issuing, at a first processing element of the plurality of processing elements, a first instruction for invalidating one or
more translation lookaside buffer entries associated with a first context in a first translation lookaside buffer associated
with the first processing element, the issuing including:

determining, at the first processing element, whether or not a state of an indicator indicates that all translation lookaside
buffer entries associated with the first context in a second translation lookaside buffer associated with a second processing
element are invalidated;

if the state of the indicator indicates that all translation lookaside buffer entries associated with the first context in
the second translation lookaside buffer are not invalidated:

sending a corresponding instruction to the second processing element of the plurality of processing elements, the corresponding
instruction causing invalidation of all translation lookaside buffer entries associated with the first context in the second
translation lookaside buffer while maintaining one or more translation lookaside buffer entries associated with one or more
other contexts in the second translation lookaside buffer, and

changing a state of the indicator to indicate that all translation lookaside buffer entries associated with the first context
in the second translation lookaside buffer are invalidated; and

if the state of the indicator indicates that all translation lookaside buffer entries associated with the first context in
the second translation lookaside buffer associated with the second processing element are invalidated:

suppressing sending of any corresponding instructions for causing invalidation of any translation lookaside buffer entries
associated with the first context in the second translation lookaside buffer to the second processing element.
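
The suppression mechanism of this claim reduces to a per-context flag: a cross-core invalidate is broadcast only when the remote TLB may still hold entries for the context, and a later remote fill re-arms the flag. The sketch below is one possible reading, with hypothetical names throughout.

```python
class TlbShootdownFilter:
    """Per-context indicator that suppresses redundant remote TLB
    invalidations (illustrative model, not the patented circuit)."""

    def __init__(self, contexts):
        # True: all remote entries for this context are already invalidated.
        self.invalidated = {ctx: False for ctx in contexts}

    def local_invalidate(self, ctx, send_remote):
        """Invalidate ctx locally; forward to the remote core only if needed."""
        if not self.invalidated[ctx]:
            send_remote(ctx)              # corresponding cross-core instruction
            self.invalidated[ctx] = True  # remember, to suppress repeats
            return True
        return False                      # suppressed: remote already clean

    def remote_fill(self, ctx):
        """The remote TLB installed a new entry for ctx; invalidates must
        be sent again next time."""
        self.invalidated[ctx] = False
```

The benefit is that repeated invalidations of the same context cost only a local flag check instead of inter-processor traffic.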

US Pat. No. 9,497,294

METHOD OF USING A UNIQUE PACKET IDENTIFIER TO IDENTIFY STRUCTURE OF A PACKET AND AN APPARATUS THEREOF

CAVIUM, INC., San Jose, ...

1. A network switch comprising:
an input port for receiving packets;
a packet generalization scheme that maintains information across a plurality of combinations of protocol layers of headers
of packets, wherein the information is maintained in a protocol table stored in a memory of the network switch, and further
wherein each of the protocol layer combinations is associated with a different unique identifier and the information is indexed
within the protocol table according to the unique identifiers;

a rewrite engine that uses the unique identifier of a packet of the packets as a key to the protocol table to access the information
for each protocol layer of the protocol layer combination of the header of the packet that the rewrite engine requires during
modification of the packet, wherein the rewrite engine expands one or more of the protocol layers based on the information
associated with the protocol layer within the protocol table, wherein the expanding includes identifying fields of each of
the one or more of the protocol layers; and

an output port for outputting the packets.

US Pat. No. 9,570,128

MANAGING SKEW IN DATA SIGNALS

Cavium, Inc., San Jose, ...

1. An apparatus for controlling a memory, said apparatus comprising: a memory controller, mode-dependent circuitry capable
of switching between a writing mode and a reading mode and whose function depends on whether it is in said writing mode or
said reading mode, and an interface to data lines connecting said memory controller to said memory, wherein each of said data
lines carries a signal that corresponds to a bit of a byte that is to be written to said memory, wherein said interface comprises,
for each of said data lines, circuitry for transmission of a bit to be written into said memory via said data line, wherein
said interface comprises, for each of said data lines, a data de-skewer, wherein for each of said data lines, said data de-skewer
is configured to receive a first data signal provided by said mode-dependent circuitry in a writing mode, wherein said first
data signal is a signal that represents a bit that is to be written into said memory, wherein each of said data lines is associated
with an inherent skew, wherein said data de-skewer applies a compensation skew to said first data signal to generate a second
data signal in said writing mode, wherein an extent of said compensation skew is selected to increase a likelihood that said
second data signal will be sampled by said memory during a data-valid window thereof, and wherein said data de-skewer is further
configured to receive a first data bit from said mode-dependent circuitry that, in a reading mode, is connected to a portion
of said interface configured to receive from said data line a signal indicative of a bit that has been read from said memory
and configured to skew said first data bit in said reading mode.

US Pat. No. 9,559,982

PACKET SHAPING IN A NETWORK PROCESSOR

Cavium, Inc., San Jose, ...

1. A circuit for managing transmittal of packets, the circuit comprising:
a packet descriptor manager (PDM) circuit module configured to generate a metapacket from a command signal, the metapacket
indicating a size and a destination of a packet to be transmitted by the circuit, the metapacket including an entry stating
the size of the packet;

a packet scheduling engine (PSE) circuit module configured to compare a packet transmission rate associated with the packet
against at least one of a peak rate and a committed rate associated with the packet, the PSE determining an order in which
to transmit the packet among a plurality of packets based on the comparison; and

a packet engines and buffering (PEB) circuit module configured to process the packet and cause a processed packet to be transmitted
toward the destination according to the order determined by the PSE;

wherein the PSE is further configured to compare, for a plurality of nodes in a path between the circuit and the destination,
a packet transmission rate associated with the node against at least one of a peak rate and a committed rate associated with
the node, the PSE determining the order based on the comparisons.

US Pat. No. 9,465,662

PROCESSOR WITH EFFICIENT WORK QUEUING

Cavium, Inc., San Jose, ...

1. A network services processor comprising:
a plurality of network services processor elements that perform work comprising packet processing operations;
a plurality of in-memory linked-lists arranged to store entries indicating work to be performed by the network services processor
elements; and

a scheduling processor configured to schedule the work for the plurality of network services processor elements, the scheduling
processor further configured to 1) detect availability of a processor to perform the work and 2) store the entries to the
plurality of in-memory linked-lists in response to detecting a lack of an available processor to perform the work in the plurality
of network services processor elements, and the scheduling processor moving the entries back from the in-memory linked-lists
to a given network services processor element of the plurality of network services processor elements in response to detecting
availability of a processor to perform the stored work in the plurality of network services processor elements,

wherein the work to be performed by the network services processor is unpacked upon being moved back from the in-memory linked-lists
to the network services processor.
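
The claimed queuing behavior, dispatching work directly when a processor element is free and parking entries in in-memory linked lists otherwise, can be modeled briefly. This is a toy model under assumed names; "pack"/"unpack" is represented only by moving entries into and out of the overflow list.

```python
from collections import deque

class WorkScheduler:
    """Sketch of overflow scheduling: work goes straight to a free element
    when one exists, otherwise into an in-memory list until one frees up."""

    def __init__(self):
        self.free = deque()      # idle processor elements
        self.overflow = deque()  # stands in for the in-memory linked-lists

    def add_work(self, entry, dispatch):
        if self.free:
            dispatch(self.free.popleft(), entry)  # processor available
        else:
            self.overflow.append(entry)           # store the entry (packed)

    def element_idle(self, element, dispatch):
        if self.overflow:
            dispatch(element, self.overflow.popleft())  # move work back (unpacked)
        else:
            self.free.append(element)
```

Stored work is drained in arrival order as elements become available, so the overflow path changes latency but not ordering.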

US Pat. No. 9,404,970

DEBUG INTERFACE FOR MULTIPLE CPU CORES

CAVIUM, INC., San Jose, ...

1. A system, comprising:
resources within processor cores that are each in communication with a debug bus such that the cores receive packets over
the debug bus,

the cores executing transactions in response to the packets,
the packets each being a type of packet selected from a group that includes Second Access Bus (SAB) packets and Debug Access
Bus (DAB) packets,

the resources including specified resources and non-specified resources,
a core that executes a transaction in response to a DAB packet accesses a specified resource and a core that executes a transaction
in response to a SAB packet accesses a non-specified resource,

a debug specification identifies the specified resources as being accessible by a debug controller but does not identify the
non-specified resources as being accessible by the debug controller.

US Pat. No. 9,335,784

CLOCK DISTRIBUTION CIRCUIT WITH DISTRIBUTED DELAY LOCKED LOOP

Cavium, Inc., San Jose, ...

1. A clock distribution circuit comprising:
a global delay locked loop (DLL) configured to receive a global clock input signal, a lead/lag input signal and to output
a clock signal;

a plurality of clock distribution blocks, each clock distribution block configured to receive the output of the global DLL,
a lead/lag input signal and to output a leaf node clock signal, wherein each clock distribution block further comprises a
local DLL;

wherein the global DLL is further configured to align one of the plurality of leaf node clock signals, output by one of the
plurality of clock distribution blocks, to a reference clock, the lead/lag input signal of the global DLL connected to the
lead/lag input signal of the one of the plurality of clock distribution blocks;

wherein each clock distribution block is further configured to align its leaf node clock signal to the reference clock based
on its lead/lag input signal.

US Pat. No. 9,065,626

BIT ERROR RATE IMPACT REDUCTION

Cavium, Inc., San Jose, ...

1. A method comprising:
receiving at a data interface a data stream having a plurality of logical communication channels, the data stream including
in succession a first data burst corresponding to one of the plurality of logical communication channels, a burst control
word and a second data burst corresponding to the one or an other of the plurality of logical communication channels, the
burst control word including a first error check that protects the first data burst and the burst control word and a second
error check that protects only the burst control word;

examining the first error check and the second error check;
erroring out only the one logical communication channel if the first error check is bad and the second error check is good; and
erroring out all open logical communication channels if the first error check is bad and the second error check is bad.
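
The containment rule in this claim is a small decision table: a bad burst check with a good control-word check implicates only the burst's own channel, while a bad control-word check implicates every open channel. A direct transcription (function and parameter names are illustrative):

```python
def channels_to_error(first_check_ok, second_check_ok, open_channels, burst_channel):
    """Return the set of logical channels to error out, per the claimed rule."""
    if first_check_ok:
        return set()              # burst protected: nothing to error out
    if second_check_ok:
        return {burst_channel}    # control word intact: contain to one channel
    return set(open_channels)     # control word corrupt: trust no channel
```

The asymmetry reflects what each CRC protects: the second check covers only the burst control word, so its failure means the channel bookkeeping itself cannot be trusted.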

US Pat. No. 10,116,564

HYBRID WILDCARD MATCH TABLE

Cavium, Inc., San Jose, ...

1. A network switch comprising:
a plurality of static random access memory (SRAM) pools storing hashable entries, wherein the SRAM pools are each associated with one or more hashing methods;
at least one spillover ternary content addressable memory (TCAM) pool storing unhashable entries that spillover from the SRAM pools, wherein the unhashable entries are entries that are unable to be hashed into the SRAM pools according to the hashing methods; and
a request interface control logic dispatching a search key for a packet to one or more active pools of the SRAM pools and the at least one TCAM pool and returning results data, wherein when one or more of the SRAM pools are active for the search key, SRAM tiles of each of the one or more of the SRAM pools return tile results, and for each of the one or more of the SRAM pools, a first level arbitration is performed between the tile results of the SRAM tile of that SRAM pool based on a priority of the tile results, and further wherein the network switch performs an action on the packet based on the returned results data.

US Pat. No. 9,858,222

REGISTER ACCESS CONTROL AMONG MULTIPLE DEVICES

Cavium, Inc., San Jose, ...

1. A circuit comprising:
a first plurality of ports for connecting to a plurality of on-chip devices via respective buses, a first bus of the respective
buses configured to carry on-chip access requests, a second bus of the respective buses configured to carry off-chip access
requests;

a second plurality of ports connecting to a master bus, the master bus further connecting to a control status register (CSR)
and an off-chip device; and

a control circuit configured to:
detect a completion status of a first off-chip access request received from the second bus, the first off-chip access request
being a request to access the off-chip device via the master bus;

selectively forward, based on the completion status, a second off-chip access request to the master bus, the second off-chip
access request being a request to access the off-chip device via the master bus; and

forward, to the master bus, an on-chip access request received from the first bus, the on-chip access request being a register
master logic (RML) request to write to the CSR, the control circuit forwarding the on-chip access request independent of the
completion status.

US Pat. No. 9,647,947

BLOCK MASK REGISTER KEY PROCESSING BY COMPILING DATA STRUCTURES TO TRAVERSE RULES AND CREATING A NEW RULE SET

CAVIUM, INC., San Jose, ...

1. A method, executed by one or more processors, for compiling data structures to process keys associated with a block mask
register (BMR) of a plurality of BMRs, the method comprising:
for each BMR of the plurality of BMRs:
identifying at least one of or a combination of: i) at least a portion of a field of a plurality of rules and ii) a subset
of fields of the plurality of fields to be masked;

building at least one data structure used to traverse a plurality of rules based on the identified at least one of or a combination
of: i) at least a portion of a field of a plurality of rules and ii) a subset of fields of the plurality of fields to be masked;
and

creating a new rule set from the plurality of rules, wherein the identified at least one of or a combination of: i) the at
least a portion of a field of the plurality of rules and ii) the subset of fields of the plurality of fields to be masked
is masked from the new rule set.
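
One way to read the final step of this claim is that the new rule set simply omits the fields (or field portions) selected by the BMR, so traversal never tests them. The one-liner below illustrates that reading; the rule representation as field-to-value dictionaries is an assumption for the example.

```python
def mask_rules(rules, masked_fields):
    """Create a new rule set with the BMR-masked fields removed
    (illustrative: rules are modeled as field -> value dicts)."""
    return [{f: v for f, v in rule.items() if f not in masked_fields}
            for rule in rules]
```

Masking at compile time, rather than per lookup, is what lets the traversal data structures stay oblivious to the BMR.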

US Pat. No. 9,584,635

BARREL COMPACTOR SYSTEM, METHOD AND DEVICE HAVING CELL COMBINATION LOGIC

Cavium, Inc., San Jose, ...

1. A barrel compactor system for extracting a subset of data from a plurality of data units that together form an input dataset,
the system comprising:
a data unit shift generator that generates an independent shift value for each of the data units within the subset of the
input dataset, wherein the independent shift value indicates a number of positions within the input dataset to shift the associated
data unit;

a data subset identifier that generates a qualifier value for each of the data units of the input dataset, wherein the qualifier
value indicates if the data unit is a part of the subset; and

a barrel compactor comprising an array of a plurality of logic cells that are each associated with a separate logical function,
wherein the barrel compactor receives the input dataset, performs the separate logical function of one or more of the logic
cells on pairs of the data units of the input dataset based on the qualifier values of the pairs of the data units, and shifts
one or more of the data units of the subset based on the independent shift values such that the subset is output in a different
position within the input dataset.

US Pat. No. 9,553,829

APPARATUS AND METHOD FOR FAST SEARCH TABLE UPDATE IN A NETWORK SWITCH

CAVIUM, INC., San Jose, ...

1. A system to support bulk search table update for a network switch, comprising:
said network switch, which comprises:
a plurality of packet processing units configured to process a received packet through multiple packet processing stages based
on search result of a table;

one or more memory units configured to:
maintain and search the table to be searched, wherein each memory row in the memory units is configured to store a plurality
of entries in the table, wherein each entry in the table is in a pair of (key, result) sections;

provide the search result to the packet processing units;
a table managing unit configured to:
accept a plurality of rules on bulk update to the table specified by a control unit;
perform the bulk update on the table based on the specified rules by
retrieving the entries in the table from each memory row within a table update range of the memory units specified in the
rules;

matching a matching pattern and a matching mask specified in the rules with the key and result sections of each of the retrieved
entries, respectively, wherein the matching mask indicates an element of the network switch to be updated and the matching
pattern represents the matching value for the element identified;

updating the key and result sections of each of the table entries via a substitution pattern and a substitution mask specified
in the rules, respectively, if a match is found;

committing the updated table entries back to the memory row within the table update range of the memory units;
said control unit configured to provide the plurality of rules on bulk update to the table managing unit without the control
unit accessing the table directly for the bulk update.

US Pat. No. 9,432,284

METHOD AND APPARATUS FOR COMPILING SEARCH TREES FOR PROCESSING REQUEST KEYS BASED ON A KEY SIZE SUPPORTED BY UNDERLYING PROCESSING ELEMENTS

Cavium, Inc., San Jose, ...

1. A method, executed by one or more processors of a router, for compiling at least one search tree based on an original rules
set, the method comprising:
determining an x number of search phases needed to process an incoming key corresponding to the original rules set, wherein
the original rules set includes a plurality of rules, where each of the plurality of rules includes an n number of rule fields
and where the incoming key includes an n number of processing fields and wherein each of the x number of search phases corresponds
to a respective portion of a plurality of portions of the incoming key;

generating y sets of search trees, where each of the y sets of search trees corresponds to a respective one of the x number
of search phases;

providing the y sets of search trees to a search processor of the router, where each of the y sets of search trees is configured
to process the respective portion of the incoming key;

generating a subject set of search trees of the y sets of search trees using a subject rule field subset of a plurality of
rule field subsets assigned to the respective one of the x number of search phases associated with the subject set of search
trees;

receiving a current search phase rule set from which to generate the subject set of search trees, wherein the current search
phase rule set is at least one of: the original rule set or a rule set received from generating a previous set of search trees;

compiling nodes of the subject set of search trees, wherein the nodes include at least one of: a root node, at least one intermediate
node, and at least one leaf node;

identifying intersections of a leaf node rule set, wherein the leaf node rule set is a subset of the rule set that is in
the at least one leaf node; and

processing, by the router, the leaf node rule set and the identified intersections, to process received packets.

US Pat. No. 9,426,166

METHOD AND APPARATUS FOR PROCESSING FINITE AUTOMATA

Cavium, Inc., San Jose, ...

1. A security appliance operatively coupled to a network, the security appliance comprising:
at least one memory;
at least one processor operatively coupled to the at least one memory, the at least one processor configured to:
walk characters of a payload in an input stream through a unified deterministic finite automata (DFA) stored in the at least
one memory, by traversing nodes of the unified DFA with characters from the payload, the unified DFA generated from subpatterns
selected from each pattern in a set of one or more regular expression patterns based on at least one heuristic; and

walk characters of the payload through at least one non-deterministic finite automata (NFA) stored in the at least one memory,
by traversing nodes of the at least one NFA with characters from the payload, the at least one NFA generated for at least
one pattern in the set, a portion of the at least one pattern used for generating the at least one NFA, and at least one walk
direction for walking characters through the at least one NFA, being based on whether a length of a subpattern selected from
the at least one pattern is fixed or variable and a location of the subpattern selected within the at least one pattern to
optimize performance of run time processing of the at least one processor for identifying an existence of the at least one
pattern in the input stream.

US Pat. No. 9,264,023

SCANNABLE FLOP WITH A SINGLE STORAGE ELEMENT

Cavium, Inc., San Jose, ...

1. A flip flop circuit comprising:
a master latch comprising a storage element, at least two legs including a data leg and at least one scan leg, a first node
of the storage element being driven by the data leg, an opposite node of the storage element being driven by the at least
one scan leg; and

a slave latch coupled to the master latch,
wherein the flip flop circuit has a single clock input and each leg receives the single clock input, and the at least one
scan leg comprises a plurality of scan legs and the data leg performs a logical OR on a plurality of scan mode inputs.

US Pat. No. 10,003,676

METHOD AND APPARATUS FOR GENERATING PARALLEL LOOKUP REQUESTS UTILIZING A SUPER KEY

Cavium, Inc., San Jose, ...

1. A processor configured as a programmable network lookup engine, the engine comprising:
a template lookup table, executed by the processor, and configured to receive and identify formats of a plurality of input tokens parsed from header fields of a plurality of network packets; and
a control data extractor, executed by the processor, and configured to extract a set of control bits from each of the input tokens, wherein the set of extracted bits are used to match with predefined values provided by a programmed network protocol;
an instruction table address generator, executed by the processor, and configured to:
perform the matching comparison between the set of extracted control bits and the predefined values by the programmed network protocol; and
generate addresses for a plurality of instruction tables, wherein the instruction tables include instructions for building a plurality of lookup requests per each of the input tokens; and
a plurality of instruction execution hardware logic blocks, executed by the processor, and configured to:
execute the instructions in the instruction tables and generate the plurality of lookup requests in parallel per each of the input tokens, wherein each of the plurality of lookup requests is represented by a lookup key; and
for each token, provide the plurality of parallel lookup keys to a search engine as a super key, where lookup operations for the lookup keys are performed using the super key, which represents contents of the plurality of parallel lookup keys for that token.
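
The parallel-key flow in this claim can be sketched in Python. The byte-range extraction specs and the simple concatenation layout below are illustrative assumptions, not the patented key format:

```python
def build_super_key(token: bytes, extract_specs):
    """Build parallel lookup keys for one token and join them into a
    single super key. Each (start, length) spec stands in for one
    instruction-driven key builder; the layout is an assumption."""
    keys = [token[start:start + length] for start, length in extract_specs]
    super_key = b"".join(keys)  # represents the contents of all keys
    return keys, super_key
```

A search engine would then run one lookup per key while receiving all of them as the single super key.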

US Pat. No. 9,747,109

FLEXIBLE INSTRUCTION EXECUTION IN A PROCESSOR PIPELINE

Cavium, Inc., San Jose, ...

1. A method for executing instructions in a processor, the method comprising:
analyzing, in at least one stage of a pipeline of the processor, operations to be performed by instructions, the analyzing
including:

determining a latency associated with a first operation to be performed by a first instruction,
determining a second operation to be performed by a second instruction, where a result of the second operation depends on
a result of the first operation, and

assigning a value to the second instruction corresponding to the determined latency associated with the first operation;
selecting one or more instructions to be issued together in the same clock cycle of the processor from among a plurality of
instructions whose operations have been analyzed, the selected one or more instructions occurring consecutively according
to a program order; and

executing instructions that have been issued, through multiple execution stages of the pipeline, and delaying a start of execution
of the second instruction by a particular number of clock cycles after the clock cycle in which the second instruction is
issued according to the value assigned to the second instruction.
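
The latency-tagging step of this claim can be sketched as follows; the instruction encoding and the latency table are invented for illustration:

```python
def assign_delay_values(instructions, op_latency):
    """For each instruction, record a delay value equal to the largest
    latency among the operations producing its inputs, so execution can
    start that many cycles after issue (a simplified model of the claim)."""
    producing_op = {}   # result name -> op that produces it
    delays = {}
    for name, op, deps in instructions:
        delays[name] = max((op_latency[producing_op[d]] for d in deps),
                           default=0)
        producing_op[name] = op
    return delays
```

An issue stage could then co-issue consecutive instructions and hold each one's execution start back by its assigned delay.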

US Pat. No. 9,665,508

METHOD AND AN APPARATUS FOR CONVERTING INTERRUPTS INTO SCHEDULED EVENTS

Cavium, Inc., San Jose, ...

1. A method for converting interrupts into scheduled events, comprising:
receiving an interrupt at an interrupt controller;
determining a vector number for the interrupt;
determining a status associated with the vector number;
reading properties of an interrupt work from a table in accordance with the vector number when the determined status is waiting;
and

scheduling by a scheduler engine the interrupt work in accordance with the properties of the interrupt work like any other
requested work.
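
The conversion the claim describes can be modeled as a small scheduler. The table layout and the "waiting" status convention follow the claim; the queue discipline is an assumption:

```python
class InterruptScheduler:
    """Sketch of converting interrupts into scheduled work items."""
    def __init__(self, work_table):
        self.work_table = work_table          # vector number -> work properties
        self.status = {v: "waiting" for v in work_table}
        self.work_queue = []                  # shared with ordinary requested work

    def handle_interrupt(self, vector):
        """Schedule the interrupt work like any other requested work."""
        if self.status.get(vector) != "waiting":
            return False                      # not in waiting status: ignore
        self.status[vector] = "scheduled"
        self.work_queue.append(self.work_table[vector])
        return True
```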

US Pat. No. 9,606,781

PARSER ENGINE PROGRAMMING TOOL FOR PROGRAMMABLE NETWORK DEVICES

Cavium, Inc., San Jose, ...

1. A processing network comprising:
a processing circuit having a programmable parser including one or more parsing engines that parse data packets received by
the processing circuit; and

a parser compiler stored on a non-transitory computer-readable memory and communicatively coupled with each of the parsing
engines, wherein the parser compiler is configured to generate values based on a parser configuration file that when programmed
into a memory associated with each of the parsing engines enables the parsing engines to identify each of a set of different
combinations of packet headers represented by the parser configuration file, wherein the memory associated with each of the
parsing engines comprises ternary content-addressable memory paired with static random-access memory, and further wherein
the parsing engines identify the combination of packet headers of one of the data packets based on a first portion of the
values stored in the ternary content-addressable memory that indicate the combination of packet header of the one of the data
packets and determine what actions to perform with the one of the data packets based on a second portion of the values stored
in the static random-access memory paired with the ternary content-addressable memory that indicate the actions to perform
with the one of the data packets.
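
The TCAM/SRAM pairing the claim relies on is a standard structure: a first-match ternary search selects an index, and that index reads the paired action memory. A generic sketch (not Cavium's exact encoding):

```python
def tcam_sram_classify(tcam, sram, header_bits):
    """First-match TCAM search paired with an SRAM action table: entry i
    of the TCAM holds (value, mask); a hit at index i selects sram[i],
    the actions to perform with the packet."""
    for i, (value, mask) in enumerate(tcam):
        if header_bits & mask == value & mask:   # masked compare
            return sram[i]
    return None                                   # no entry matched
```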

US Pat. No. 9,607,672

MANAGING SKEW IN DATA SIGNALS WITH ADJUSTABLE STROBE

Cavium, Inc., San Jose, ...

1. An apparatus for controlling a memory, said apparatus comprising: a memory controller, a data interface that interfaces
with data lines that connect said memory controller to said memory, a plurality of data de-skewers, each of said data de-skewers
being associated with a corresponding one of said data lines, a strobe interface that interfaces with a strobe line that connects
said memory controller to said memory, and a strobe de-skewer that is in communication with said strobe line, wherein each
of said data lines carries a data signal, wherein said data interface is in data communication with each of said data lines,
wherein said strobe interface is configured to apply a timing signal to said strobe line, wherein each of said data de-skewers
is configured to operate in a write mode, in which a bit is to be written to said memory, and in a read mode, in which a bit
is to be read from said memory, wherein said data de-skewer that corresponds to said data line is configured to apply a compensation
skew to a data signal that is carried by said data line, wherein when said data de-skewer is being operated in said write
mode, said data signal represents a bit that is to be written to said memory, and wherein when said data de-skewer is being
operated in said read mode, said data signal represents a bit that has been read from said memory, wherein each of said data
lines has an inherent skew, wherein each of said data de-skewers applies a compensation-skew to a data signal that is being
carried by a data line with which said data de-skewer corresponds, and wherein said strobe de-skewer is configured to skew
said timing signal by an amount that is selected to be one of less than and equal to a maximum delay of said strobe de-skewer
and one of greater than and equal to a minimum delay of said strobe de-skewer, and said strobe-deskewer is configured to skew
said timing signal by a timing-signal skew that has been selected to reduce a collective sampling error that is defined based
at least in part on: (1) an extent to which each of said data signals is sampled at a center of a data-valid window thereof,
and (2) an extent to which at least one of said data signals has a compensation skew applied through one or more delay stages
towards a beginning of a delay line.

US Pat. No. 9,497,117

LOOKUP FRONT END PACKET OUTPUT PROCESSOR

Cavium, Inc., San Jose, ...

1. A method of processing a packet comprising:
merging a plurality of sub-tree responses from a processing cluster, the processing cluster performing rule matching for a
packet, the plurality of sub-tree responses being responsive to lookup requests associated with the packet, each of the sub-tree
responses including a lookup response based on rule matching the packet against a subset of rules specified by a sub-tree;
and

outputting a lookup result to a host processor, the lookup result including at least one of the plurality of sub-tree responses
based on relative priority of the plurality of sub-tree responses.
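
The merge step can be sketched directly; the convention that a lower number means higher priority is an assumption here:

```python
def merge_subtree_responses(responses):
    """Merge per-sub-tree lookup responses into one lookup result,
    keeping the response(s) at the best relative priority."""
    best = min(r["priority"] for r in responses)
    return [r for r in responses if r["priority"] == best]
```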

US Pat. No. 9,471,509

MANAGING ADDRESS-INDEPENDENT PAGE ATTRIBUTES

Cavium, Inc., San Jose, ...

1. An apparatus comprising:
a storage device configured to store memory pages including a first memory page retrieved from the storage device in response
to a page fault issued after an attempt to retrieve data in the first memory page from a physical address space;

an external memory system including a main memory controller coupled to main memory having the physical address space; and
a processor that includes (1) at least one memory management unit coupled to the external memory system, and (2) at least
one central processing unit configured to run a hypervisor at a first access level and at least one guest operating system
at a second access level;

wherein the processor is configured to:
at the second access level, translate from virtual addresses in a virtual address space to intermediate physical addresses
in an intermediate physical address space using mappings in a first page table accessed by the guest operating system;

at the second access level, determine class information for a second memory page mapped by the first page table based on a
classification of virtual addresses within the virtual address space, wherein the class information determined at the second
access level is independent from: (1) any bits used to indicate virtual addresses, and (2) any bits used to indicate intermediate
physical addresses;

at the first access level, translate from the intermediate physical addresses to physical addresses in the physical address
space of the main memory using mappings in a second page table accessed by the hypervisor;

at the first access level, determine class information for the second memory page mapped by the second page table based on
a classification of intermediate physical addresses within the intermediate physical address space, wherein the class information
determined at the first access level is independent from: (1) any bits used to indicate intermediate physical addresses, and
(2) any bits used to indicate physical addresses; and

process class information for the second memory page determined at different access levels to determine processed class information
for the second memory page using a dynamic processing rule.

US Pat. No. 9,443,053

SYSTEM FOR AND METHOD OF PLACING CLOCK STATIONS USING VARIABLE DRIVE-STRENGTH CLOCK DRIVERS BUILT OUT OF A SMALLER SUBSET OF BASE CELLS FOR HYBRID TREE-MESH CLOCK DISTRIBUTION NETWORKS

Cavium, Inc., San Jose, ...

1. A method of generating macrocells of a semiconductor device according to an integrated circuit design including a hybrid
tree mesh clock distribution network, the method comprising:
generating with a computing device and storing on a non-transitory computer-readable medium, a collection of macrocells instantiated
in the integrated circuit design, wherein instance names of the macrocells include placement information for placing the macrocells
in a layout of the integrated circuit design, and further wherein each of the macrocells includes one or more corresponding
base cells;

determining target drive strengths of clock signals for multiple sequential components on the semiconductor device;
determining groups of standard-size clock-driving elements, wherein each of the standard-size clock-driving elements corresponds
to one of the base cells and each of the groups has a drive strength equal to one of the target drive strengths;

combining the clock-driving elements into the groups;
extracting with the computing device, from each of the instance names of the macrocells on the non-transitory computer-readable
medium, the corresponding placement information; and

distributing the clock signals having the target drive strengths on the semiconductor device with the groups of clock-driving
elements as the groups form the hybrid tree mesh clock distribution network of the semiconductor device.

US Pat. No. 10,091,137

APPARATUS AND METHOD FOR SCALABLE AND FLEXIBLE WILDCARD MATCHING IN A NETWORK SWITCH

Cavium, Inc., San Jose, ...

1. A network switch to support scalable and flexible wildcard matching (WCM), comprising:
a packet processing pipeline including a plurality of packet processing units configured to process a received packet through multiple packet processing stages, wherein each of the packet processing units is configured to
generate and provide a master key for a WCM request to a memory pool;
process the received packet based on WCM rules of the WCM request returned from the memory pool;
said memory pool including a plurality of memory groups to be searched by the packet processing pipeline, wherein each of the memory groups is configured to
maintain a plurality of WCM tables to be searched in one or more SRAM memory tiles of the memory group;
accept and format the master key generated by the packet processing unit into a compact key based on a bitmap per user configuration, wherein the compact key is shorter in size than the master key;
hash the formatted compact key and perform wildcard matching with the WCM tables stored in the one or more SRAM memory tiles of the memory group using the formatted compact key, wherein certain fields in the compact key are don't care fields for the wildcard matching;
process and provide the WCM rules from the wildcard matching to the requesting packet processing unit.
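
The bitmap-driven key compaction and hashing can be sketched as below. The byte granularity of the bitmap and the hash function are assumptions; the claim only requires that the compact key be shorter than the master key:

```python
def format_compact_key(master_key: bytes, bitmap: int) -> bytes:
    """Format a master key into a shorter compact key: bit i of the
    per-configuration bitmap selects byte i of the master key."""
    return bytes(b for i, b in enumerate(master_key) if (bitmap >> i) & 1)

def wcm_bucket(compact_key: bytes, num_buckets: int) -> int:
    """Hash the compact key to pick a bucket of WCM table entries
    (an illustrative stand-in for the hardware hash)."""
    h = 0
    for b in compact_key:
        h = (h * 31 + b) & 0xFFFFFFFF
    return h % num_buckets
```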

US Pat. No. 9,866,540

SYSTEM AND METHOD FOR RULE MATCHING IN A PROCESSOR

Cavium, Inc., San Jose, ...

1. A system comprising:
a format block configured to (a) receive a key including one or more bits from a packet, at least one rule for matching the
key, and rule formatting information, the at least one rule having at least one rule dimension, the at least one rule dimension
including a set of one or more bits from a corresponding rule of the at least one rule, and (b) extract each at least one
rule dimension from the at least one rule; and

a plurality of dimension matching engines (DMEs), each DME, of the plurality of DMEs, coupled to the format block and configured
to receive the key and a corresponding formatted dimension, and process the key and the corresponding formatted dimension
for returning a match or nomatch.

US Pat. No. 9,721,627

METHOD AND APPARATUS FOR ALIGNING SIGNALS

Cavium, Inc., San Jose, ...

1. A method of aligning a data signal with a corresponding clock signal, the method comprising:
oversampling the data signal based on the corresponding clock signal;
detecting an indication of a skew between the data signal and the corresponding clock signal based on the oversampling; and
adjusting a variable delay line coupled to the data signal based on the indication of skew detected, the variable delay line
being decoupled from the corresponding clock signal.

US Pat. No. 9,645,790

ADDER DECODER

Cavium, Inc., San Jose, ...

1. A combined adder and decoder circuit, comprising:
n bit inputs, A and B, wherein n is greater than 1;
a hardware logic circuit with n logic stages, each logic stage having one or more gates, configured to:
perform a first operation of propagating a result of a preceding stage on the condition that the sum of A[m] and B[m] is equal
to 0, wherein 0<=m<n;

perform a second operation of performing a bitwise left shift by 2^m of the result of the preceding stage on the condition that the sum of A[m] and B[m] is equal to 1;

perform a third operation of performing a bitwise left shift by 2^(m+1) of the result of the preceding stage on the condition that the sum of A[m] and B[m] is equal to 2;

an output at stage n providing a decoded sum of inputs A and B.
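
A behavioral model makes the claim's mechanism concrete: a one-hot value enters stage 0, and stage m either propagates it, shifts it left by 2^m, or shifts it left by 2^(m+1) according to the per-bit digit sum. Since each stage's shift equals the binary weight that bit position contributes to the sum, the composite shift is a + b:

```python
def adder_decoder(a: int, b: int, n: int) -> int:
    """Behavioral model of the combined adder/decoder: returns a
    one-hot value with bit (a + b) set, i.e. the decoded sum."""
    result = 1                                  # one-hot value entering stage 0
    for m in range(n):
        digit_sum = ((a >> m) & 1) + ((b >> m) & 1)
        if digit_sum == 1:
            result <<= 2 ** m                   # second operation
        elif digit_sum == 2:
            result <<= 2 ** (m + 1)             # third operation
        # digit_sum == 0: first operation, propagate unchanged
    return result
```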

US Pat. No. 9,671,844

METHOD AND APPARATUS FOR MANAGING GLOBAL CHIP POWER ON A MULTICORE SYSTEM ON CHIP

Cavium, Inc., San Jose, ...

1. A method for controlling power consumption in a multi-core processor chip, the method comprising:
accumulating, at a central controller within the multi-core processor chip, one or more power estimates associated with multiple
core processors within the multi-core processor chip, the accumulating including sending a read command from the central controller
to at least one core processor of the multiple core processors and wherein the at least one core processor receiving the read
command from the central controller updates a parameter value of the read command representing a cumulative sum of power estimates
to produce an updated cumulative sum and forwards the read command with the updated cumulative sum to one other core processor
or to the central controller;

determining a global power threshold based on a cumulative power estimate, the cumulative power estimate being determined
based at least in part on the one or more power estimates accumulated; and

causing power consumption at each core processor to be controlled based on the global power threshold determined;
wherein determining the global power threshold includes:
adjusting a parameter value representing the global power threshold based on the cumulative power estimate determined and
a corresponding average over time, relative to a desired target power.
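
The accumulation ring and the threshold adjustment can be modeled as follows; the fixed-step adjustment rule is an assumption standing in for the claimed dynamic rule:

```python
def accumulate_power_ring(core_estimates):
    """Model of the read-command ring: the command carries a
    cumulative-sum parameter; each core adds its own power estimate and
    forwards the command, which finally returns to the controller."""
    cumulative = 0                      # parameter value in the read command
    for estimate in core_estimates:     # command visits each core in turn
        cumulative += estimate          # core updates the cumulative sum
    return cumulative

def adjust_threshold(threshold, cumulative, running_avg, target, step=1.0):
    """Nudge the global power threshold toward the desired target based
    on the latest cumulative estimate and its average over time."""
    observed = (cumulative + running_avg) / 2
    return threshold - step if observed > target else threshold + step
```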

US Pat. No. 9,628,385

METHOD OF IDENTIFYING INTERNAL DESTINATIONS OF NETWORK PACKETS AND AN APPARATUS THEREOF

Cavium, Inc., San Jose, ...

10. A method of implementing a network chip, comprising:
processing a packet to identify a unique packet identifier of the packet based on contents of the packet and to form a token
based on extracted header fields of the packet, wherein the unique packet identifier corresponds to a combination of protocol
layers that form a header of the packet;

identifying a port number of an arrival chip port of the packet, wherein the arrival chip port is one of a plurality of chip
ports on the network chip;

forming a key by combining the unique packet identifier of the packet and certain bits of the port number corresponding to
the identified chip port;

determining an internal destination of the token within the network chip based on the key; and
forwarding the token to the destination.
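
The key-forming and routing steps can be sketched as below; the 4-bit port-number width and the bit layout are illustrative, not the claimed encoding:

```python
PORT_BITS = 4   # number of port-number bits folded into the key (assumed)

def form_destination_key(packet_id: int, arrival_port: int) -> int:
    """Combine the unique packet identifier with certain low bits of the
    arrival chip port number to form the lookup key."""
    return (packet_id << PORT_BITS) | (arrival_port & ((1 << PORT_BITS) - 1))

def route_token(key: int, destination_table: dict, default="drop"):
    """Determine the token's internal destination within the chip by key."""
    return destination_table.get(key, default)
```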

US Pat. No. 9,606,942

PACKET PROCESSING SYSTEM, METHOD AND DEVICE UTILIZING A PORT CLIENT CHAIN

Cavium, Inc., San Jose, ...

1. A packet processing system on a packet processing device, the system comprising:
a non-transitory computer-readable packet memory organized into one or more memory banks;
a packet memory arbiter coupled with read ports and write ports of the one or more memory banks of the packet memory; and
a plurality of system ports that are each associated with one of a plurality of hierarchical clients, wherein each of the
plurality of hierarchical clients and the packet memory arbiter are serially communicatively coupled together via a plurality
of primary interfaces thereby forming a unidirectional client chain, and further wherein all of the plurality of hierarchical
clients write the packet data to or read the packet data from the packet memory via the unidirectional client chain.

US Pat. No. 9,514,246

ANCHORED PATTERNS

Cavium, Inc., San Jose, ...

1. A method comprising:
in a processor:
building an unanchored state graph for unanchored patterns of a plurality of given patterns, the unanchored state graph including
nodes representing a state of the unanchored state graph;

building a separate anchored state graph for given patterns, of the plurality of given patterns, marked as anchored patterns,
the anchored state graph including nodes representing a state of the anchored state graph;

for each node of the anchored state graph, determining a failure value equivalent to a node representing a state in an unanchored
state graph representing unanchored patterns of the plurality of given patterns; and

including a failure value of a root node of the anchored state graph, the failure value being equivalent to a root node of
the unanchored state graph.

US Pat. No. 9,445,107

LOW LATENCY RATE CONTROL SYSTEM AND METHOD

Cavium, Inc., San Jose, ...

1. A video transmission encoding system comprising:
a slice partitioner to ensure a rate control block within an image frame is composed of an integer number of slices from the
image frame, wherein the rate control block includes a plurality of macroblocks;

an encoder to encode the plurality of macroblocks for the rate control block; and
a buffer to store encoded data for each rate control block,
wherein a bit rate for the video transmission system and a size of the buffer are set according to a parameter for the rate
control block, and

wherein the encoder avoids overflow of the buffer by alternately skipping encoding for selected inter-frame macroblocks of
the rate control block and removing prediction residue from intra-frame macroblocks of the rate control block.

US Pat. No. 9,305,129

SYSTEM FOR AND METHOD OF TUNING CLOCK NETWORKS CONSTRUCTED USING VARIABLE DRIVE-STRENGTH CLOCK INVERTERS WITH VARIABLE DRIVE-STRENGTH CLOCK DRIVERS BUILT OUT OF A SMALLER SUBSET OF BASE CELLS

Cavium, Inc., San Jose, ...

1. A method of tuning an integrated circuit including a plurality of capacitive loads and a clock network, wherein the clock
network includes a spine, one or more supporting ribs coupled to the spine, a plurality of base cells coupled to the ribs
and one or more cross-links coupling pairs of the base cells that are on different ones of the ribs together, the method comprising:
determining a collection of macrocells, wherein each of the macrocells is formed by one or more of the base cells and instantiated
in the integrated circuit, wherein each of the macrocells have a drive strength determined by one or more base cells that
form the macrocell, each of the base cells have an input pin and an output pin, and each of the macrocells are for driving
one of the capacitive loads on the integrated circuit;

representing each of the macrocells within a physical database as a group of the base cells that form the macrocell configured
such that the group logically behaves as if the base cells were a single macrocell;

choosing an input and an output of each of the macrocells by marking a location of an input pin of the input pins of the base
cells of the macrocell and marking a location of an output pin of the output pins of the base cells of the macrocell and generating
terminals at the marked locations;

associating the terminals with the macrocells upon which the terminals are located in a table such that each macrocell is
associated with a pair of the terminals that indicate the input and the output of the macrocell; and

tuning the integrated circuit by adjusting the drive strength of one or more of the macrocells based on a size of the capacitive
load that is driven by the macrocell in order to balance a parameter of the clock network, wherein the parameter is measured
from one or both of the terminals of the macrocell.

US Pat. No. 9,866,657

NETWORK SWITCHING WITH LAYER 2 SWITCH COUPLED CO-RESIDENT DATA-PLANE AND NETWORK INTERFACE CONTROLLERS

Cavium, Inc., San Jose, ...

1. A method for network switching with layer 2 switch communicatively coupled co-resident data-plane and network interface
controllers, comprising:
receiving a packet from a communication network at the layer 2 switch;
parsing the packet; and
determining in accordance with a content of the parsed packet whether the packet is to be switched to one of one or more medium
access controllers, or one of one or more packet input processors, or one of one or more network interface controllers of
a network interface resource comprising the one or more packet input processors, one or more packet output processors, the
one or more network interface controllers, and the layer 2 switch, implemented on a chip, wherein

the one or more packet input processors and the one or more packet output processors employ event driven processing.

US Pat. No. 9,753,859

INPUT OUTPUT VALUE PREDICTION WITH PHYSICAL OR VIRTUAL ADDRESSING FOR VIRTUAL ENVIRONMENT

Cavium, Inc., San Jose, ...

1. A method for input/output (I/O) value determination at a processor core, comprising:
generating an I/O instruction comprising at least a physical address;
comparing the physical address from the I/O instruction with a database of physical addresses assigned to I/O devices and
when the comparing is successful

determining the I/O device or a state on the I/O device to receive the I/O instruction in accordance with the physical address;
setting a value in a first register to a value identifying the determined I/O device or the state on the I/O device;
predicting a value to be set in a second register in accordance with the physical address; and
setting a value in a third register.
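
The address-comparison step can be modeled with per-device address ranges, a simple stand-in for the claimed database of physical addresses assigned to I/O devices:

```python
def match_io_device(phys_addr: int, device_ranges: dict):
    """Compare a physical address from an I/O instruction against each
    device's (base, size) range; return the device name on a successful
    comparison, else None. Range semantics are an assumption."""
    for device, (base, size) in device_ranges.items():
        if base <= phys_addr < base + size:
            return device
    return None
```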

US Pat. No. 9,720,773

MANAGING REUSE INFORMATION IN CACHES

Cavium, Inc., San Jose, ...

1. A method for managing address translation and caching, the method comprising:
retrieving a first memory page from a storage device in response to a page fault issued after an attempt to retrieve data
in the first memory page from a physical address space of a main memory of an external memory system;

issuing the attempt to retrieve the data in the first memory page in response to a cache miss issued after an attempt to retrieve
the data in the first memory page from a first cache line of a first cache of the external memory system; and

managing address translation and caching from a processor that includes (1) at least one memory management unit coupled to
the external memory system, and (2) at least one central processing unit configured to run a hypervisor and at least one guest
operating system, the managing including:

translating from virtual addresses in a virtual address space to intermediate physical addresses in an intermediate physical
address space;

translating from the intermediate physical addresses to physical addresses in the physical address space of the main memory;
determining reuse information for memory pages based on estimated reuse of cache lines of data stored within the memory pages;
storing the determined reuse information independently from: (1) any bits used to indicate virtual addresses, (2) any bits
used to indicate intermediate physical addresses, and (3) any bits used to indicate physical addresses; and

using the stored reuse information to store cache lines in a selected group of multiple groups of cache lines of the first
cache.

US Pat. No. 9,595,003

COMPILER WITH MASK NODES

Cavium, Inc., San Jose, ...

1. A method comprising:
building a decision tree structure including a plurality of nodes using a classifier table having a plurality of rules representing
a search space, the plurality of rules having at least one field, each node representing a subset of the search space;

building the decision tree structure including, at each node, (a) dividing the subset of the search space represented by the
node into smaller subsets by (i) determining a node type for the node, the node type determination enabling a combination
of node types in the decision tree structure, (ii) selecting one or more fields of the at least one field and selecting one
or more bits of the selected one or more fields based on the node type determined for the node, a node type of a parent node
of the node, and a consumed bit indicator for the node, the consumed bit indicator specifying all bits consumed for search
space division by each ancestor of the node, and (iii) cutting the node into child nodes on the selected one or more bits
to create the smaller subsets and allocating the created smaller subsets to the child nodes;

(b) updating the consumed bit indicator to specify the selected one or more bits as utilized and associating the updated consumed
bit indicator with each of the child nodes; and

storing the built decision tree structure.
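
The cutting step with a consumed-bit indicator can be sketched on a simplified search space (plain integer keys stand in for rules over packet fields):

```python
def cut_node(keys, selected_bits, consumed_bits):
    """Cut one decision-tree node on bits not yet consumed by any
    ancestor: each child receives the keys whose values on the selected
    bits match the child's index, and the consumed-bit indicator is
    updated so descendants never reuse those bits."""
    assert selected_bits & consumed_bits == 0, "bits already consumed"
    positions = [b for b in range(selected_bits.bit_length())
                 if (selected_bits >> b) & 1]
    children = {}
    for k in keys:
        idx = 0
        for j, bit in enumerate(positions):
            idx |= ((k >> bit) & 1) << j      # child index from selected bits
        children.setdefault(idx, []).append(k)
    return children, consumed_bits | selected_bits
```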

US Pat. No. 9,582,251

ALGORITHM TO ACHIEVE OPTIMAL LAYOUT OF DECISION LOGIC ELEMENTS FOR PROGRAMMABLE NETWORK DEVICES

Cavium, Inc., San Jose, ...

1. A processing network for receiving a source code having one or more code paths that are each associated with one or more
conditions and one or more assignments of the source code, the network comprising:
a plurality of processing elements on a programmable microchip, wherein each of the processing elements comprise one or more
instruction tables and a logic cloud including a grid of logic devices, wherein a first column of the grid receives logic
cloud input and a last column of the grid transmits logic cloud output;

a plurality of on-chip routers on the microchip for routing the data between the processing elements, wherein each of the
on-chip routers is communicatively coupled with one or more of the processing elements; and

a compiler stored on a non-transitory computer-readable memory and comprising a logic cloud mapper that, based on the grid
of logic devices, assigns functions to one or more of the logic devices and routes operable connections between the one or
more of the logic devices such that the logic cloud, in conjunction with the instruction tables, implement the conditions
and the assignments of the code paths of the source code,

wherein each function corresponds to one or more device input values and a device output value, and further wherein the logic
device assigned one of the functions will output the device output value in response to inputting the device input values,

wherein the device input values and the device output value are selected from the group consisting of primary inputs that
are to be received from the logic cloud input, intermediate results that are to be received from one of the logic devices
or primary outputs that are the logic cloud output to be transmitted to the instruction tables,

wherein the logic cloud mapper determines all possible serial chains of the functions that can be formed such that:
the device input values of the function at the start of each of the chains are one or more of the primary inputs;
the device output value of the function at the end of each of the chains is one of the primary outputs; and
for every pair of the functions that are adjacent within each of the chains, the device output value of the preceding function
of the pair matches at least one of the device input values of the other function of the pair.

US Pat. No. 9,569,362

PROGRAMMABLE ORDERING AND PREFETCH

Cavium, Inc., San Jose, ...

1. A circuit for controlling access to a memory, comprising:
a register storing an order configuration, the order configuration indicating rules for ordering access requests;
a request buffer configured to receive first and second access requests;
a prefetch buffer configured to receive the first and second access requests in parallel with the request buffer; and
a control circuit configured to:
forward the first access request to a memory;
monitor the completion status of the first access request;
selectively forward or suspend the second access request based on the order configuration and the completion status of the
first access request; and

in response to suspending the second access request, forward a prefetch command to the memory and forward a third request
from the prefetch buffer to the memory.
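
The control circuit's decision can be sketched as a small dispatch function; the action tuples are an illustrative interface, not the hardware signaling:

```python
def dispatch_second_request(ordered: bool, first_done: bool,
                            second_req, third_req):
    """If the order configuration permits (or the first request has
    completed), forward the second request; otherwise suspend it, send
    a prefetch command for it, and forward a third request from the
    prefetch buffer so the memory stays busy."""
    if not ordered or first_done:
        return [("forward", second_req)]
    return [("prefetch", second_req), ("forward", third_req)]
```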

US Pat. No. 9,565,136

MULTICAST REPLICATION ENGINE OF A NETWORK ASIC AND METHODS THEREOF

Cavium, Inc., San Jose, ...

40. A method of implementing a replication engine, the method comprising:
accessing a first table and a second table;
receiving a packet having packet data, a mirror bit mask vector and a pointer to an entry in the second table;
determining whether a switchover feature is enabled;
upon the determination that the switchover feature is not enabled, mirroring the packet according to a first linked list stored
in the second table and to the mirror bit mask vector, wherein each entry in the first linked list includes an evif field
storing destination information of where to send mirrored packet data; and

upon the determination that the switchover feature is enabled, replicating the packet according to a first live link of a
second linked list stored in the second table, wherein each entry in the second linked list includes a link indicating a route
upon which the packet data is able to be sent to a desired destination.
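
The mirroring path (switchover disabled) can be sketched by walking the linked list against the mirror bit mask vector; the positional mask semantics are an assumption:

```python
def mirror_packet(packet, mirror_mask: int, linked_list):
    """For each linked-list position enabled in the mirror bit mask
    vector, emit the mirrored packet data toward that entry's evif
    destination."""
    return [(entry["evif"], packet)
            for i, entry in enumerate(linked_list)
            if (mirror_mask >> i) & 1]
```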

US Pat. No. 10,110,393

PROTOCOL SWITCHING OVER MULTI-NETWORK INTERFACE

Cavium, Inc., San Jose, ...

4. A method for switching between mirroring and streaming protocols for processing multimedia content at a sink device, comprising:
establishing a layer 2 (L2) connection at the sink device;
utilizing a Real Time Streaming Protocol (RTSP) control protocol to establish a first session using the mirroring protocol;
switching between the mirroring protocol and a Digital Living Network Alliance (DLNA) streaming protocol by:
receiving at the sink device using the L2 connection a Digital Living Network Alliance (DLNA): set audio/video (AV) transport universal resource identifier (URI) request;
sending from the sink device using the L2 connection a DLNA: set AV transport response;
receiving at the sink device using the L2 connection a DLNA: play request;
sending from the sink device using the L2 connection a Wi-Fi Display (WFD): real time streaming protocol (RTSP) trigger pause request;
receiving at the sink device using the L2 connection a WFD: RTSP pause response;
sending from the sink device using the L2 connection a Digital Living Network Alliance (DLNA): play response; and
establishing a session using the DLNA streaming protocol at the sink using the L2 connection.
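The sink-side exchange in claim 4 can be summarized as a request-to-response table; the message strings below are shorthand for the DLNA and WFD RTSP messages named in the claim, and the sketch collapses the intermediate WFD pause response into a single step.

```python
# Simplified sink-side handler for the mirroring-to-streaming switchover.
SINK_RESPONSES = {
    "DLNA:set_av_transport_uri": ["DLNA:set_av_transport_response"],
    # On DLNA:play the sink first pauses the mirroring session via a WFD
    # RTSP trigger, then acknowledges the play, completing the switchover.
    "DLNA:play": ["WFD:rtsp_trigger_pause", "DLNA:play_response"],
}

def sink_handle(incoming):
    """Return the messages the sink sends over the L2 connection in
    response to one incoming request."""
    return SINK_RESPONSES.get(incoming, [])
```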

US Pat. No. 9,620,213

METHOD AND SYSTEM FOR RECONFIGURABLE PARALLEL LOOKUPS USING MULTIPLE SHARED MEMORIES

Cavium, Inc., San Jose, ...

1. A system on-chip configured to support N parallel lookups using a pool of shared memories, the system on-chip comprising:
a pool of T×M shared memories grouped into T tiles;
M index converters for each of N lookup paths;
a central reconfigurable interconnect fabric for connecting N input ports to the T tiles;
an output reconfigurable interconnect fabric for connecting the T tiles to N output ports; and
N output result collectors, wherein each of the N output result collectors is per one lookup path, wherein the system on-chip
is configured to perform N parallel lookups against the pool of T×M shared memories along the N lookup paths, wherein N, T
and M are positive integer values.
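A rough functional model of the claimed lookup fabric follows; the hash-based converter is a placeholder (the patent's index converters are reconfigurable hardware), and the dictionary-backed tiles stand in for the T×M shared memories.

```python
T, M, N = 4, 2, 2
tiles = [[{} for _ in range(M)] for _ in range(T)]   # T tiles of M memories

def index_convert(key, m):
    """Placeholder for the m-th index converter of a lookup path:
    map a key to a (tile, address) pair."""
    h = hash((key, m))
    return h % T, (h // T) % 256

def insert(key, value):
    tile, addr = index_convert(key, 0)
    tiles[tile][0][addr] = value

def lookup(key):
    """One of the N lookup paths: probe its M converted indices across
    the shared tiles and collect the first hit."""
    for m in range(M):
        tile, addr = index_convert(key, m)
        hit = tiles[tile][m].get(addr)
        if hit is not None:
            return hit
    return None
```

In hardware, N such paths run in parallel through the central interconnect fabric, with per-path result collectors gathering the tile outputs.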

US Pat. No. 9,601,181

CONTROLLED MULTI-STEP DE-ALIGNMENT OF CLOCKS

Cavium, Inc., San Jose, ...

1. A method for controlling operation of a system that extends across at least two clock domains and that comprises a plurality
of functional units in different clock domains, wherein said plurality of functional units comprises a first functional unit
and a second functional unit, said method comprising:
driving said first functional unit in a first clock domain with a first clock-signal,
driving said second functional unit in a second clock domain with a second clock-signal,
receiving said first clock-signal from a first clock-signal source,
providing a phase delay with respect to said first clock-signal from a component within said first functional unit for use
within said first functional unit,

providing said second clock-signal by delaying said first clock-signal using a first delay line,
receiving reference information indicative of said phase delay from a component within said first functional unit by interrogating
said component, wherein a target time-domain offset between said first and second clock-signals is based on said phase delay,
and

dynamically controlling said first delay line to cause said second clock-signal to sustain a temporal offset that causes an
offset between said first and second clock-signals to take a step toward said target time-domain offset, and wherein said
step has a first step-size that is independent of a difference between said target time-domain offset and said offset between
said first and second clock-signals.
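The key property of the final limitation is that each control update moves the offset by a fixed step whose size does not depend on the remaining error. A minimal numeric sketch (units and values arbitrary):

```python
def step_toward(offset, target, step):
    """One control update of the delay line: take one fixed-size step
    toward the target time-domain offset; the step-size is independent
    of the distance still to cover."""
    if abs(target - offset) <= step:
        return target
    return offset + step if target > offset else offset - step

def converge(offset, target, step):
    """Repeat fixed-size steps until the target offset is reached."""
    history = [offset]
    while offset != target:
        offset = step_toward(offset, target, step)
        history.append(offset)
    return history
```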

US Pat. No. 9,946,671

METHODS AND SYSTEMS FOR PROCESSING READ AND WRITE REQUESTS

Cavium, Inc., San Jose, ...

1. A machine implemented method for processing an input/output (I/O) request sent by an initiator adapter to a target adapter coupled to a target controller, comprising:
generating the I/O request by the initiator adapter of a computing device that interfaces with the target adapter;
indicating by the initiator adapter an I/O request pattern to the target adapter, where the I/O request pattern indicates if the I/O request is sequential in nature;
when the I/O request is a read request that is sequential in nature as indicated by the I/O request pattern, the target adapter notifying the target controller to read ahead data associated with other sequential read requests;
storing the read ahead data at a cache such that data for the other sequential read requests is provided from the cache instead of a storage device managed by the target controller; and
when the I/O request is a sequential write request as indicated by the I/O request pattern that also indicates that data associated with the sequential write request is to be accessed within a threshold duration, then processing the sequential write request by claiming space at the cache.
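The sequential-read branch of this claim amounts to hint-driven read-ahead at the target; the sketch below models it with a dictionary-backed storage device and a fixed read-ahead depth, both assumptions for illustration.

```python
class ReadAheadTarget:
    """Toy model of the claimed target-side behavior: when the initiator
    flags a read as sequential, the target reads ahead and serves the
    following requests from cache instead of the storage device."""

    def __init__(self, storage, depth=4):
        self.storage = storage      # block number -> data
        self.depth = depth          # assumed read-ahead window
        self.cache = {}
        self.disk_reads = 0

    def _from_disk(self, block):
        self.disk_reads += 1
        return self.storage[block]

    def read(self, block, sequential=False):
        if block in self.cache:
            return self.cache.pop(block)      # served from cache
        data = self._from_disk(block)
        if sequential:                        # read ahead the next blocks
            for b in range(block + 1, block + 1 + self.depth):
                if b in self.storage and b not in self.cache:
                    self.cache[b] = self._from_disk(b)
        return data
```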

US Pat. No. 9,544,402

MULTI-RULE APPROACH TO ENCODING A GROUP OF RULES

CAVIUM, INC., San Jose, ...

1. A method for encoding a plurality of key matching rules grouped in a chunk, each of the key matching rules beginning with
a header and having at least one dimension, the method comprising:
in a rule encoding engine, communicatively coupled to memory and provided with a chunk of key matching rules, building a multi-rule
corresponding to the chunk comprising:

storing in the memory a multi-rule header of the multi-rule, the multi-rule header representing, collectively, a plurality
of headers stored one after the other, the multi-rule header being decoded by a rule matching engine in a single decode operation
to extract the plurality of headers of the key matching rules, wherein the plurality of headers include values which control
the rule matching engine processing of the key matching rules, including dimensions, wherein the rule matching engine formats the
key matching rules based on a key and matches the key matching rules against the key to find a match based on the values stored
in the plurality of headers.
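The multi-rule idea can be sketched as packing the per-rule headers back-to-back into one word that is decoded in a single operation; reducing each header to a 4-bit dimension count is an assumption made purely for illustration.

```python
HDR_BITS = 4   # assumed per-header width

def encode_multi_header(dim_counts):
    """Pack each rule's header field, one after the other, into a single
    multi-rule header word."""
    word = 0
    for i, dims in enumerate(dim_counts):
        word |= (dims & (2**HDR_BITS - 1)) << (i * HDR_BITS)
    return word

def decode_multi_header(word, n_rules):
    """Single decode operation recovering all per-rule headers at once."""
    mask = 2**HDR_BITS - 1
    return [(word >> (i * HDR_BITS)) & mask for i in range(n_rules)]
```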

US Pat. No. 10,141,949

MODULAR SERIALIZER AND DESERIALIZER

Cavium, LLC, Santa Clara...

1. A deserializer circuit, comprising:
an input buffer configured to receive a serial data signal; and
an array of cells, each cell comprising an input flip-flop and an output flip-flop, the array of cells including:
a bottom row of cells configured to receive a plurality of partial words in parallel from the input buffer to the input flip-flops of the bottom row of cells, the plurality of partial words corresponding to the serial data signal;
at least one intermediary row of cells configured to 1) receive the plurality of partial words from a preceding row of cells, and 2) transfer a subset of the plurality of partial words to a successive row of cells of the array of cells; and
a top row of cells configured to receive one of the plurality of partial words from a preceding row of cells of the array of cells;
the array of cells outputting a word in parallel via the output flip-flops, the word corresponding to the plurality of partial words.
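The row-by-row transfer can be approximated by a simple shift model; this sketch takes one partial word per cycle (a simplification of the claim's parallel transfer of partial-word subsets), with row count and word widths assumed.

```python
class DeserializerArray:
    """Toy model of the cell array: each cycle the bottom row latches a
    new partial word and earlier words shift one row toward the top; once
    every row holds a partial word, the output flip-flops present the
    assembled word in parallel."""

    def __init__(self, rows):
        self.cells = [None] * rows   # index 0 = bottom row, last = top row

    def clock(self, partial_word):
        # shift existing partial words toward the top row
        self.cells = [partial_word] + self.cells[:-1]

    def output_word(self):
        if None in self.cells:
            return None              # word not yet complete
        return "".join(reversed(self.cells))   # top row holds oldest bits
```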

US Pat. No. 9,590,797

EDGE RATE CONTROL CALIBRATION

Cavium, Inc., San Jose, ...

1. A circuit comprising:
an oscillator providing a set of clock phase signals;
a main edge rate controller (ERC) coupled to the oscillator and configured to adjust an edge rate of each clock phase signal
of the set of clock phase signals;

an interpolator coupled to the main ERC and configured to interpolate the adjusted set of clock phase signals to provide at
least one desired phase output signal;

an edge rate controller calibrator comprising a ring oscillator including at least three ERCs connected in a loop, a counter
configured to count a number of cycles of the ring oscillator over a given period, and a finite state machine (FSM) configured
to compare the counter count to a given value corresponding to an operating frequency of the circuit and to adjust operation
of the circuit based on the comparison.
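The calibrator's control loop reduces to counting ring-oscillator cycles over a window and stepping the ERC setting until the count matches the target; the linear frequency model below is an assumption standing in for the real ERC-loaded ring.

```python
def ring_freq(setting):
    """Assumed monotonic model: a higher ERC setting speeds up the
    three-ERC ring oscillator (units: MHz, values arbitrary)."""
    return 100 + 10 * setting

def calibrate(target_count, window_us=1.0, setting=0, max_iter=50):
    """FSM-style loop: count ring cycles over the window, compare the
    count to the target value for the operating frequency, and nudge
    the setting until they match."""
    for _ in range(max_iter):
        count = int(ring_freq(setting) * window_us)   # counter value
        if count == target_count:
            return setting
        setting += 1 if count < target_count else -1
    return setting
```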