Cryptographically verifiable, distributed dependency reviews
reviewer: HeroicKatora
$ cargo crev repo fetch url https://github.com/HeroicKatora/crev-proofs
$ cargo crev id trust -sApEowWcAS9J0R7aO18cghvhLBpuMhyeUuWQq_fits
repo: https://github.com/HeroicKatora/crev-proofs
Please, use mobile in landscape.
The implementation is rather conservative on many fronts and requires very
strong, sometimes even unnecessary, preconditions for all operations. But
this makes reasoning easier since you work with a consistent set of
assumptions. This is in contrast to zerocopy
which has at three differing
sets.
The biggest leap of unsafety is the assumption that slices strides and arrays
agree with the size of their elements. This is not quite likely to change,
ever, but it should be noted nevertheless. The exact wording of Pod
also
allows us to smuggle a type through its requirements. It takes some care to
try and only allow types that might violate the assumption by requiring a
defined repr
but in its wording forgets that repr(packed)
can be applied
to repr(rust)
. Thus, the following type conforms to the wording but not the
spirit behind it.
\#[repr(packed)] struct BadSize(u16, u8)
The easiest fix would be to explicitly list the requirement that the size is
divisible by the alignment. This defines the stride to agree with the size.
Has fuzzing integration and been scrutinized by a previous review. I've done
version 0.1.2 in some detail and the only changes merged since then are bug
fix PRs addressing those. Mostly worried about inadvertently introduced bugs
in future changes due to awkward interfaces and implicit trust between
different components within the code.
For example, code in struct_item.rs
relies on the reader providing only 4
byte aligned buffers instead of reasserting that fact as an internal
invariant. However this invariant is relied upon for an aligned pointer load
later.
Similarly some other C style patterns are prominent. Instead of functions
testing an invariant and constructing a result based off of it, boolean
return and construction by the caller is used. Example:
self.assert_enough_struct(offset, desc_size)?;
let desc_be = unsafe {
&*((&self.struct_block[offset..]).as_ptr() as *const PropertyDesc)
as &PropertyDesc
};
This is fine in its current usage as far as I can tell but it's not very
stable with regards to possible future changes.
This also means that manual arithmetic is used for bounds checks which is
prone to missed overflow considerations etc.
A remaining antipattern is that of a byte output buffer: StructItem
offers
reading its value as strings or a u32
list. But instead of an iterator over
the backing memory the implementation takes an mutable reference to a byte
slice, manually aligns it to fit the output type, casts it, and writes the
data types &'_ str
and u32
. I have not found concrete misbehaviour from
this but it seems awkward.
Liberal use of unsafe
and sparse validation of in puts indices and offsets.
In principle, the dtb format lends itself well to this use as the file format
itself already requires the alignment of many members and takes care to have
naturally packed structs–with aligned members but no padding.
It is thus possibly safe to map many parts of an immutable input directly to
structs marked as repr(C)
, which also correctly appears.
However, the unsafe
blocks contain only few indications of consideration of
their safety. Sometimes alignment checks appear obviously above but most
iterators implicitely trust their callers on the alignment of internal
buffers. It also seems that not all functions relying on unsafe
preconditions are marked unsafe
. This applies to internal functions only
but may make the crate more brittle than necessary.
Another antipattern is that of a byte output buffer: A StructItem
offers
reading its value as strings or a u32
list. But instead of an iterator over
the backing memory the implementation takes an mutable reference to a byte
slice, manually aligns it to fit the output type, casts it, and writes the
data types &'_ str
and u32
. I have not found concrete misbehaviour from
this but it seems awkward.
No references and not compliant to all possible implementations of the POSIX
interfaces it utilizes. Manifestation of the unsafety requires the hostname
to be set to a 255 byte string by the machine operator and is thus not
directly exploitable, especially on Linux or glibc where it is impossible to
configure the system into such a state.
This is not meant to assign blame towards the author, Crate repository is
properly abandoned and reverse dependencies should switch to another
implementation.
I'm the author and this packet avoids internal unsafe wherever it can. In its
current state it only requires it to perform a rather trivial cast for a
transparent [u8]
wrapper type.
The package also considers resource exhaustion and interfaces with
potentially panicking, internal allocation generally come with a twin that
instead tries to perform the operation inplace and is a no-op error if that
is not possible.
Trait FromByteSlice
relies on associated item FromByteArray::Array
being
a proper byte array. However, this latter trait and associated type are safe
to implement with an almost arbitrary other type (a few trait std bounds
don't stop any implementation).
Version 4.0 fixes UB from casting byte slice to integer reference without
any alignment checks.
A bit puzzling that the inline implementation does not use UnsafeCell
directly but only through Cell
. This adds a Sync impl which explicitely can
not exist for Cell itself. But the Once is enough to protect against races
even though there is no real benefit from using Cell
over UnsafeCell
.
Has unsoundness in a major, safe interface.
The main utility for Vec can reuse an allocation of differing element size
thus violating the explicit requirements of Vec::from_raw_parts and in
particular the allocator contract, potentially leading to memory corruption
on drop of the resulting Vec.
The interface affected are (maybe not complete):
MapVecInPlace::map
MapVecInPlace::map_in_place
MapVecInPlace::filter_map
MapVecInPlace::filter_map_in_place
An analysis of the code to show the issue:
In a macro, this code checks for various size and alignment constraints on
deciding whether to execute an in-place branch or a fallback (that may panic
in some variants).
unsafe {
if size!($a) == 0 || size!($b) == 0 {
$zero
} else if align!($a) != align!($b) {
$alignment
} else if $f(size!($a),size!($b)) {
$incompatible
} else {
$ok
}
}
Already a naming issue appears, as the $incompatible
branch is actually
taken when f
returns true
and some instantiation has |a,b| a==b
as this
argument. Consequently, the incompatible
parameter is filled with the
in-place branch in the fallback
branch where the parameter is $ok:expr
Note that the check for map
is |a,b| a%b==0
and it invokes $fallback
with the $ok:expr
argument set to map_vec(self, f)
(note: the f
here is
from the parameters of map
). The map_vec
function is an unsafe
function
eventually doing the equivalent of
let (ptr, len, cap) = /* The obvious */;
// Some transformation code on raw
.
Vec::from_raw_parts(ptr, len, cap)
This violates very clearly the contract which states:
ptr
's T needs to have the same size and alignment as it was allocated with
Not sure about the strict correctness of memrchr under stacked borrows but
the forward functions definitely seem fine. That said, miscompilation seems
unlikely since stacked borrows is more aggressive than llvm's noalias and raw
pointers do not receive that tag anyways.
With a prior parsing bug fixed and Rust soon stabilizing floating point to
integer conversion without UB the previous reservations no longer hold. It
looks stable as is and doesn't attempt anything far fetched.
Fixed panic when parsing floating point literals
Not quite ready for untrusted input due to panics and not fuzzed. Minor
soundness concerns for floating point operations, rooted in Rust languageas
operator not having fully specified behaviour (yet). All is well for the
integer part of the library.
On first glance you'll find a lot of unsafety but most of it is (now) benign.
The largest parts are casting a transparent, repr-C wrapper struct to native
arrays or slices and forwarding impls of bytemuck::{Pod, Zeroable}
accordingly. There's a derive crate for bytemuck that may be used instead.
There are not a lot of safety comments but not a lot of invariants used
either.
A longstanding soundness issue of interpreting a user input type T
as
bytes... This goes back to 0.2
at least where we have the following trait:
pub trait ComponentBytes<T> {
fn as_slice(&self) -> &[T];
fn as_mut_slice(&mut self) -> &mut [T];
// Provides: does this obvious transmutation cast of `as_slice` result.
fn as_bytes(&self) -> &[u8] { ... }
}
This is so unsound, the user can even return an wrong slice of an arbitrary
and just let this trait do the dirty work of inspecting its bytes.
In a more recent version we're allowed to even write into that byte slice!
Let us create null references, invalid enums, whatever your heart desires.
Small, stable, and highly useful.
Most of the code is straightforward but to increase usability there are threeunsafe
blocks. It seems feasible to write a completely safe version of the
API without hurting the main use case too much. The benefit, however, would
be slim except for those focussed on purging even well-reviewed localunsafe
. Let's explain how we agree with all reasoning, and documentation.
src/lib.rs:351:ScopeGuard::into_inner
: Here it performs a manual move from
a to-be-forgotten value that can't be destructured due to the rules around
Drop types. It's slightly awkward to use mem::forget
after the copy
instead of wrapping the value in a ManuallyDrop
at the start but the
comment correctly explains why the order is equivalent and still sound. It
also seems odd that this pseudo-destructuring does not return the function
instance but only the value and thus always drops the functor which
potentially also drops any contained captured value and thus might panic.
That seems more ergonomic in the common case or when a pure function is used,
so it isn't critical.
src/lib.rs:422:impl Sync
: This is likely the most involved here but the
comment explains it well enough. It might have been/be beneficial to create a
very small newtype wrapper that enforces that no such access by reference can
take place in future versions of the crate also. This would also reduce the
mental load, as this impl with all its type bounds would not be explicitly
required. It would nevertheless not reduce the amount of necessary soundness
reasoning so this is just polish.
src/lib.rs:455:Drop::drop for ScopeGuard
: Same as into_inner
but since
this is within the Drop impl there is no need (and possibility) to forgetself
so this is self-explanatory. This could share a common core with theinto_inner
method, a private inherent method that ptr::read
's the stored
values. Avoiding duplication here would be more important if different ways
of destructuring existed.
The crate contains no comments on how the supposed constant time operations
are actually achieved. There is no mention of a memory model, no discussion
of references, black boxes, speculation, optimization, etc. And on top of
these theoretical shortcomings the crate also contains no tests to verify any
of their assumptions. This means: no asm
output review, no smoke tests for
operation's time dependencies and a general lack of even functional tests.
The classify
method is just a glorified constructor and even contains a
generic which will in general have no guarantees and is totally misplaced
here. And declassify
doesn't really do anything since the whole type state
can always be dereferenced into the underlying type if the compiler pleases
(and, oh, it will).
An unexplained fill-horn of converters is added. Since they do not perform
any more robust sequencing or basic optimization barriers, one might as well
use the classify
constructor instead. They are even declared #[inline]
!
Some allocating to-bytes converters are offered. Despite plainly leaking all
values into the global heap without any protection or zeroing after
expiration, they have no documentation, unlike much less dangerous methods.
Despite the fact that several operations are disallowed as a 'security
feature', an inconsistent other set is allowed. Overflow checked operations
will, almost by definition, at least contain conditional moves if compiled
naively using the built-ins. But for some unexplained reason, the division
operation of all is untrusted!
Many other such ops -- with invalid and panicking inputs or similar -- exist:
Rotations will probably do a branch on the input value, additions overflow
check and complain etc. in debug code. Maybe the bitwise operations are not
totally broken.
WORST! Some operations are transliterated from WireGuard. But they contain
TYPOS that plainly contradict newly introduced variable names. How do they
even work? It is not explained.
By the way, the code is copied from a GPL licensed project (WireGuard) but
the crate itself advertises an MIT license. So this might also be a licensing
risk for the future.
The crate contains no comments on how the supposed constant time operations
are actually achieved. There is no mention of a memory model, no discussion
of references, black boxes, speculation, optimization, etc. And on top of
these theoretical shortcomings the crate also contains no tests to verify any
of their assumptions. This means: no asm
output review, no smoke tests for
operation's time dependencies and a general lack of even functional tests.
The classify
method is just a glorified constructor and even contains a
generic which will in general have no guarantees and is totally misplaced
here. And declassify
doesn't really do anything since the whole type state
can always be dereferenced into the underlying type if the compiler pleases
(and, oh, it will).
An unexplained fill-horn of converters is added. Since they do not perform
any more robust sequencing or basic optimization barriers, one might as well
use the classify
constructor instead. They are even declared #[inline]
!
Some allocating to-bytes converters are offered. Despite plainly leaking all
values into the global heap without any protection or zeroing after
expiration, they have no documentation, unlike much less dangerous methods.
Despite the fact that several operations are disallowed as a 'security
feature', an inconsistent other set is allowed. Overflow checked operations
will, almost by definition, at least contain conditional moves if compiled
naively using the built-ins. But for some unexplained reason, the division
operation of all is untrusted!
Many other such ops -- with invalid and panicking inputs or similar -- exist:
Rotations will probably do a branch on the input value, additions overflow
check and complain etc. in debug code. Maybe the bitwise operations are not
totally broken.
WORST! Some operations are transliterated from WireGuard. But they contain
TYPOS that plainly contradict newly introduced variable names. How do they
even work? It is not explained.
By the way, the code is copied from a GPL licensed project (WireGuard) but
the crate itself LOOKS like an MIT license but actually mentions APACHE
within the text. So this might also be a licensing risk for the future.
No internal use of unsafe. Only an get_unchecked unsafe api without bounds
that translates directly to Vec and does not exploit the possible unsafety in
its implementation to the fullest extent (unreachable instead of
unreachable_unchecked).
No unsafe code. This review includes the binary which does some file IO. I
believe one might get the binary to overwrite arbitrary files by symbolically
linking files (and the binary) such that they are located next to each other.
This puts it into a portable
mode, which means it does not only write to
its own XDG directories. Not a real problem as that works for a lot of
applications.
As a side effect of the linear time requirements it also ensures that
allocations are bounded. However, there is no built-in way to interoperate
with any custom limits, precalculate required memory, or reuse allocations
made separately before.
Very focussed forbid(unsafe)
crate.
The minor flaws are introduced by trying to mimic the std interface too much
and too quickly. This leads to some misbehaviour in remove and drain, several
known slowness issues—algorithmic complexity as well as efficiency of
operations—and poorly documented interfaces with slightly non-standard panic
behaviour, and unknown decision making for adding them.
Why implement fmt::*
interface when the standard library does not for Vec?
How can Extend be called without risking panicking?
No safety critical flaws though and I expect most to be fixable within the
interface that was published (except the abundance of trait impls that is a
matter of taste).
Best summarized as a flood of unsafe blocks.
The src/lib.rs
already looks somewhat suspicious. We have a thread local
containing a pointer that is unsafely accessed. It is set by transmuting a
local reference to a dynamic trait object which has its lifetime erased. This
particular portion might be fine if no code that uses the trait depends on
that lifetime. However, all of this is worryingly done without any statement
of purpose, any reference to the overall structure, and little to no
comments.
Then we look into src/scheduler.rs
and find our worst nightmares come to
life. Here's a short list of unexplained stuff:
uncommented impls of Sync and Send
an adhoc implementation of an intrusive queue/linked-list, giving a blog
post about C (!) as the source. Maybe one of the better sources, I do not
know, but it appears somewhat unmotivated and is intertwined with the rest of
the code.
several different UnsafeCell that have been encapsulated with telepathy,
apparently, since there are no comments explaining them nor their access.
unsafe blocks spanning ~70 lines
and the amazing unsafe fn ptr2arc(ptr: *const T) -> Arc containing
let anchor = mem::transmute::<usize, Arc<T>>(0x10);
without further explanation. Note that T
is a type parameter that is
exposed to the user through the type parameter P
of CurrrentThread
.
This is restricted to tokio-executor::park::Park
and provides an
associated type which is later used as the T
in question. Note that a
public constructor exists.
This is slightly unsound at best, if the node must be aligned to more than
that function guarantees, and horrible with regards to how the pointer is
used as a manual memoffset calculation afterwards.
The part where Node
is used to ignore the lifetime of a contained object
and only its Drop implementation double panicking protects against using and
dropping the object out of its own lifetime is the least worrying here. Also,
since the comment refers to some type parameter T
that doesn't exist in the
impl.
CI coverage is good, running both rustfmt and clippy on every push,
regardless of branch. It does not seem to run specifically on pull requests,
which, I believe, leads to Github not checking the merge commit
specifically but instead just individual commits on the remote branch. Not
too consequential in any case, just slightly unorthodox.
Since 0.2.0 (the previous review) decoding has been added which seems to have
resulted in an additional internal dependency (bit_field) that has not yet
reached 1.0
. However, it remained stable for the past year. It's not
immediately clear why a simple shift-and-mask would not be clean enough here,
the traits provided by the dependency seem much more relevant if they help
enable an interface or generic implementation—which is both not required
here.
The decode interface looks sound and decently usable. There is an interesting
difference in that its _with
variant returns a count on success whereas the
encoding methods do not. Additionally, it's unclear how to efficiently and
properly react to an OverflowError
because neither the number of
successfully consumed input nor the actually required output would be
available to the caller.
The encode interface is unchanged since 0.2.0
. Thus I'm preserving the
previous review.
Contains unsafe code as a mechanism to eliminate bounds checks by asserting
properties with unreachable_unchecked. However, this appears to make proper
(and only proper) use of the unsafe assertion that strings are valid UTF-8
given by the standard library.
The unsafe code is documented and the reasoning behind it is explained. The
only minor flaw is that it does not properly explain that the validity
invariant used is being tested for. Since it does reference both utf-8 and
the byte length in its argument this should be straightforward to the reader
though.
A slightly cleaner way of testing the conditions might have been matching the
first byte with integer ranges, which would have eliminated the last bounds
check without an additional unreachable assertion.
No strong
rating due to superficially unimportant dependency, and concerns
about the maturity of the interface (see above).
It remains unclear if an iterator based design instead of callbacks was
considered, potentially placing control flow in the hands of the caller. This
may be important for some cases where buffers need to be flushed while
efficiently preserving the decoding state.
Contains unsafe code as a mechanism to eliminate bounds checks by asserting
properties with unreachable_unchecked. However, this appears to make proper
(and only proper) use of the unsafe assertion that strings are valid UTF-8
given by the standard library.
The unsafe code is documented and the reasoning behind it is explained. The
only minor flaw is that it does not properly explain that the validity
invariant used is being tested for. Since it does reference both utf-8 and
the byte length in its argument this should be straightforward to the reader
though.
A slightly cleaner way of testing the conditions might have been matching the
first byte with integer ranges, which would have eliminated the last bounds
check without an additional unreachable assertion.
No strong
rating since the crate is incomplete and the trust of maintainers
is not clearly established.
© bestia.dev 2023, MIT License, Version: 2023.608.1636
Open source repository for this web app: https://github.com/bestia-dev/cargo_crev_web/
The crate gained quite a bit of interface since last time. I'm not quite sure
how I feel about this at the moment but understanding definitely suffered
from it.
Of most concern is definitely
TransparentWrapper
which relies on theinternal implementation detail that the layout of a pointer type itself does
not change for transparent wrappers. This premise seems a very unlikely to be
invalidated from changes but nevertheless departs with only relying on
stabilized and fully RFCed properties.
Other than that, no critical changes and a continued trend of being cautious.
Notably the implementation of Contiguous guards against bad implementations
despite being unsafe to implement, the
offset_of
macro is completelysafe(!)—a welcome change for such macros—and there are MIRI tests in CI.
The test suite could be a lot bigger but some tests are obviously foiled by
MIRI rejecting some sound and UB-free code that relies on alignment checks,
to avoid those incidentally succeeding in unsound code.