
Posted by John Dev


12 Jan, 2025

Updated at 20 Jan, 2025

How to optimize lookups in generic functions without a static?

It might have been asked a million times. I am just bad at finding the solution.

The basic problem: I have 3000+ struct types that share a common trait. I want to implement JSON serialization logic for the trait that dispatches to the concrete types. All is well if I settle for the linear complexity of iterating over the types. I struggle to figure out how to use a map that gives logarithmic or constant-time lookup.

Here is the basic code I have now:

    use std::any::TypeId;

    pub trait Polymorphic: std::fmt::Debug {
        fn obj_type_id(&self) -> TypeId;
        fn to_polymorphic_ref<'a>(&'a self) -> &'a dyn Polymorphic;
        fn to_polymorphic_mut<'a>(&'a mut self) -> &'a mut dyn Polymorphic;
    }

    impl serde::Serialize for dyn Polymorphic {
        fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
        where
            S: serde::Serializer,
        {
            serialize_polymorphic(self, serializer)
        }
    }

    fn serialize_polymorphic<S>(p: &dyn Polymorphic, serializer: S) -> Result<S::Ok, S::Error> 
        where S: serde::Serializer {
        if let Some(cat) = cast_ref::<Cat>(p) {
            return Cat::serialize(cat, serializer);
        }
        if let Some(dog) = cast_ref::<Dog>(p) {
            return Dog::serialize(dog, serializer);
        }
        if let Some(animal) = cast_ref::<Animal>(p) {
            return Animal::serialize(animal, serializer);
        }
        if let Some(data_object) = cast_ref::<DataObject>(p) {
            return DataObject::serialize(data_object, serializer);
        }
        Err(serde::ser::Error::custom("Unknown type"))
    }

The goal is to match the type in constant time rather than linear time.

I want to replace the series of 'if let' statements with a map lookup keyed by the TypeId of the object. I already have this working for deserialization.
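To make the intended lookup concrete, here is a minimal sketch of a TypeId-keyed registry. It assumes the output type is fixed (a String here) rather than generic over a serializer, which is exactly the part that is easy. `Cat` and `Dog` are simplified stand-ins, and `ToJsonFn`, `registry` and `to_json` are hypothetical names that are not part of the code above.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::OnceLock;

// Simplified stand-ins for two of the concrete types (assumed fields).
struct Cat {
    name: String,
}
struct Dog {
    age: u8,
}

// Non-generic function pointer type, so a single static map can hold it.
type ToJsonFn = fn(&dyn Any) -> Option<String>;

fn cat_to_json(value: &dyn Any) -> Option<String> {
    value
        .downcast_ref::<Cat>()
        .map(|c| format!("{{\"name\":\"{}\"}}", c.name))
}

fn dog_to_json(value: &dyn Any) -> Option<String> {
    value
        .downcast_ref::<Dog>()
        .map(|d| format!("{{\"age\":{}}}", d.age))
}

// Built once on first use; later calls only read the map.
fn registry() -> &'static HashMap<TypeId, ToJsonFn> {
    static REGISTRY: OnceLock<HashMap<TypeId, ToJsonFn>> = OnceLock::new();
    REGISTRY.get_or_init(|| {
        let mut map: HashMap<TypeId, ToJsonFn> = HashMap::new();
        map.insert(TypeId::of::<Cat>(), cat_to_json);
        map.insert(TypeId::of::<Dog>(), dog_to_json);
        map
    })
}

// One hash lookup instead of a chain of casts; None for unregistered types.
fn to_json(value: &dyn Any) -> Option<String> {
    let serialize = registry().get(&Any::type_id(value))?;
    serialize(value)
}
```

With this sketch, `to_json(&Cat { name: "Tom".to_string() })` goes through one hash lookup and one downcast no matter how many types are registered. It works only because `ToJsonFn` has no generic parameter.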

Functionally, typetag and/or erased_serde work. However, with 3000+ struct types and around 400 trait types in which the data structs are arranged, they blow up the executable size and compile time enormously. I think typetag generates a ton of code for each of the 400 trait types, which is not necessary in my case. My whole type system is a classic Java-like library with a single root type and very deep inheritance relationships. (I tried using a tree of enums first, and that works great, except that it is inhumanly complex to write code against 10-level-deep nested enums. That is why I switched to traits with some traitcast voodoo.) So in my case a single serialize and deserialize implementation shared by all traits works great.

I have the deserialization part more or less done. The good thing is that the visitor in serde's Deserialize is not so generic.

The subjectively more trivial Serialize trait proves a bit of a problem. Some things I considered so far:

a. Use a static HashMap<TypeId, SerializationFn>. This does not work because even when a static is declared inside a generic function there is only a single static instance, so it cannot depend on the serializer type. It is explained in the Rust book.
b. Use a two-level mapping keyed first by the TypeId of the Serializer, over which the serde::Serialize::serialize method is generic, e.g. HashMap<SerializerTypeId, HashMap<SerializedTypeId, SerializationFn>>. This does not work unless I enter unsafe waters, as I cannot declare the map with a SerializationFn that uses the generic type of the serializer.
c. Not tried yet, but maybe I need to rip out the erased_serde type erasure logic. I cannot really get my head around it, and that is what is stopping me. A pointer to a good read on this would help.
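For illustration, options (b) and (c) come down to one idea: if the serializer is passed as a trait object, the function pointer stored in the map no longer needs a generic parameter. The sketch below shows that idea using only the standard library. `MiniSerializer` is a hypothetical, heavily simplified stand-in and is not serde's or erased_serde's API. `KeyValueText`, `ErasedSerializeFn`, `erased_registry` and `serialize_erased` are also made-up names.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::OnceLock;

// Hypothetical stand-in for a serializer abstraction (NOT serde's API).
// It only models "there are many possible output formats".
trait MiniSerializer {
    fn field(&mut self, key: &str, value: &str);
}

// One concrete output format, used to exercise the generic entry point.
struct KeyValueText {
    out: String,
}

impl MiniSerializer for KeyValueText {
    fn field(&mut self, key: &str, value: &str) {
        self.out.push_str(key);
        self.out.push('=');
        self.out.push_str(value);
        self.out.push(';');
    }
}

// Simplified stand-ins for two concrete data types (assumed fields).
struct Cat {
    name: String,
}
struct Dog {
    age: u8,
}

// The serializer arrives as a trait object, so this type is not generic
// and fits into a single static map.
type ErasedSerializeFn = fn(&dyn Any, &mut dyn MiniSerializer);

fn serialize_cat(value: &dyn Any, s: &mut dyn MiniSerializer) {
    if let Some(cat) = value.downcast_ref::<Cat>() {
        s.field("name", &cat.name);
    }
}

fn serialize_dog(value: &dyn Any, s: &mut dyn MiniSerializer) {
    if let Some(dog) = value.downcast_ref::<Dog>() {
        s.field("age", &dog.age.to_string());
    }
}

fn erased_registry() -> &'static HashMap<TypeId, ErasedSerializeFn> {
    static REGISTRY: OnceLock<HashMap<TypeId, ErasedSerializeFn>> = OnceLock::new();
    REGISTRY.get_or_init(|| {
        let mut map: HashMap<TypeId, ErasedSerializeFn> = HashMap::new();
        map.insert(TypeId::of::<Cat>(), serialize_cat);
        map.insert(TypeId::of::<Dog>(), serialize_dog);
        map
    })
}

// Generic entry point: `S` is coerced to `&mut dyn MiniSerializer` at the
// call below. Returns false when the type is not registered.
fn serialize_erased<S: MiniSerializer>(value: &dyn Any, serializer: &mut S) -> bool {
    match erased_registry().get(&Any::type_id(value)) {
        Some(f) => {
            f(value, serializer);
            true
        }
        None => false,
    }
}
```

The trade-off is one extra dynamic dispatch per serializer call. As far as I understand, erased_serde applies the same kind of erasure to the real serde::Serializer trait, which is considerably more involved because of serde's associated types and by-value serializer.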

For what it is worth, GPT-4o and Gemini did not provide a working solution. They are very bad with Rust. When they fix one problem in such a complex puzzle, they add 4 or 5 new problems to think through.

Any advice or help will be appreciated.
