Hi Peter,
thanks for comments!
testLoadFieldPeersOnePeerInfoPortZero(), testLoadFieldPeersOnePeerInfoPortTooBig() passing a 0 port is semantically incorrect. My question in general is whether we should do a full validation of the data in the reader (both correct format and correct value ranges) or should we only check for correct format and assign Null values to field with invalid ranges. Using Nulls would allow us to use the information that is correct (e.g. a single bad peer port would not render all the 20 returned peers invalid). What do you think?
I was thinking about that, too. In the port validation I decided to allow every 16b value (including zero) and relinquish the desicion whether it’s valid (semantically) or not to a class that will use it (the port value). Similar problem is with the QHostAddress class - if you pass it IP like 333.333.333.333, it’s a “valid” value for the class, because it tests only range (hope I haven’t missed something). I’m trying to validate as much as possible and the port, IP and DNS name are the only fields I’m not 100% sure what exactly we should check and “how much”. So maybe we should document what ranges/values can getters return so a caller can bargain for that? Problem with Null values is that we can’t use them on mandatory fields, like info_hash/peer_id, because there is specified that they must be exactly 20B long and can’t be empty. There is no problem with optional fileds, though (they keep their default value if parsed data are not valid) - but ports are also mandatory in the peer info dictionary. I think that our TrackerResponse/Torrent classes should return only “valid” data (port/IP/ DNS name issues), but I also agree that it’s too much strict to mark whole response invalid if one of, lets say, 100 peers is invalid, so maybe we should keep only valid peers and if there is a non-valid peer info, just discard it and continue. What do you think about that?
I agree that we should agree on some level of validating, so should we validate port/IP/DNS names strictly (IPv4 333.333.333.333 is invalid) or not and let a user class decide? I currently cannot remember any other thing that can cause problems, but maybe there are more things.
PeerInfo – that might be the first class which we could substitute with Uri. The actual contents and format of the URI does not need to be known by class that stores it and passes it around (e.g. BitTorrent’s Transfter). TorrentResponseParser would convert the data from the bencoded dictionary into an Uri object, which TransferSession would split into its components to connect to the host and do the handshaking. We could use the scheme bt-peer (which does not to be formally registered as we only use it internally):
[snip]
The advantage of using the hierarchical variant is that we can easily extract the individual fields using existing Uri functions and then convert them to the corresponding types (QhostAddress, uint16, PeerId). When such a URI is passed to a TransferSession (or TransferSessionFactory), it will first assert that the scheme is bt-peer, then will know how to make a tcp URI (from the authority component of the bt-peer URI) which will be passed to a TransportFactory object that will create the corresponding SocketTransport. I’m not clear about how to pass additional transport parameters to TransportFactory in a generic way (e.g. read and write buffer sizes). Maybe append a query item to the tcp URI? We should experiment…PeerId could be a class like VendorCode – it stores the raw ID and would provide helpers to extract the Client name and version. What do you think?
Sounds really interesting! I ran into a few problems with PeerInfo class in the TrackerResponseParser (port/IP/DNS names, as described above) and what you wrote could solve them all :). Especially IP/DNS name differencings in the TrackerResponseParser could vanish, so the implementation would become more clearer. So, if I understand it correctly, the peer info parsing would be something like the following one:
Create PeerInfoList class instance
Get bencoded info peer dictionary
foreach peer_info:
Get IP/DNS name from it (string), don't validate it and pass it directly to a new Uri object
Get port (int), (validate it, or not?) and pass it to the Uri object
Get peer_id (string), validate the length
If peer_id is valid:
Pass it to the Uri object and append the Uri to the PeerInfoList class instance
Else:
Discard this peer info
I personally think that it's definitely worth at least trying!
Petr
