[pbs-devel] [RFC backup 0/6] Two factor authentication

Wed Dec 2 15:21:07 CET 2020

On Wed, Dec 02, 2020 at 03:05:42PM +0100, Thomas Lamprecht wrote:
> On 02.12.20 14:35, Oguz Bektas wrote:
> > On Wed, Dec 02, 2020 at 02:07:25PM +0100, Thomas Lamprecht wrote:
> >> On 02.12.20 13:35, Oguz Bektas wrote:
> >>> On Wed, Dec 02, 2020 at 01:27:47PM +0100, Thomas Lamprecht wrote:
> >>>>> 3. don't store all the tfa information in a single json file.
> >>>>>
> >>>>
> >>>> makes no sense to me, any reason you mention below can happen to
> >>>> arbitrary files, so just adds complexity while not gaining
> >>>> anything.
> >>>>
> >>>>> current version uses a single /etc/proxmox-backup/tfa.json file
> >>>>> which holds all the tfa info for all the users. this is a single
> >>>>> point of failure because:
> >>>>> - file can be corrupted, causing tfa to break for everyone (no
> >>>>>   more logins)
> >>>>> - file could get deleted, disabling/bypassing 2fa for everyone
> >>>>> - file could get leaked in a backup etc., giving everyone's tfa
> >>>>>   secrets and/or recovery keys to attackers (bypass everything)
> >>>>>
> >>>>> better is to at least create a file for each user:
> >>>>> /etc/proxmox-backup/tfa/<username>.json or similar
> >>>>>
> >>>>> this way the damage is contained if for example the config
> >>>>> breaks because of incorrect deserialization etc.
> >>>>
> >>>> Why would deserialisation be incorrect for one single file but
> >>>> magically work with multiple files? Makes no sense.
> >>>
> >>> of course this can happen on arbitrary files...  i don't see why
> >>> it would add any complexity to use multiple files though (actually
> >>> makes it simpler imo). the reasoning behind this was to avoid a
> >>> single point of failure like i explained:
> >>>
> >>> multiple files for users -> only that user is affected by broken
> >>> config, other users can log in
> >>> single file for all users -> all users affected if config breaks
> >>> and nobody can log in
> >>
> >> I see that almost as an anti-feature: if such a thing happens,
> >> it's actually better that it's broken for all, as then it gets
> >> admin attention and one can actually look for the underlying root
> >> cause - which at that point is probably memory or disk
> >> corruption/failure - or where does Wolfgang's serializer break for
> >> all-in-one but not for split??
> >>
> >>
> >>>
> >>> so the point wasn't to magically fix (potential) incorrect
> >>> deserialization but to reduce breakage in case something like that
> >>> happens.
> >>
> >>
> >> like "what" happens? There's no such thing as one serialization is
> >> fine and the other not - if you start assuming that transient error
> >> model you cannot do anything at all anymore!
> > 
> > as i explained already, it's not about if one serialization is fine
> > > and the other isn't; if we have one big mess of a json file holding
> > all the secrets of everyone's tfa config, and at any point there's
> > some bug in the serializer or any other component that interacts
> > with this, then this can lead to DOS of ALL accounts on the server
> > (or compromise of ALL secrets in that file). the model is different
> > than the normal authentication mechanism with pam/pbs realms, since
> > the 2fa configuration has (untrusted) user input that gets
> > serialized and added into a root-owned file during the setup.
> > letting any user on any realm do this is IMO bad practice.
> 
> It's not a mess it's clearly structured. Serde does just a fine job
> serializing JSON, a simple format to escape, plus we define schemas
> with validation for that exact reason.
> 
> > 
> > furthermore we could easily add a check during auth to see if the
> > tfa.json parses to correct json, and if not pop up an error message
> > like "2FA configuration invalid, please contact administrator" etc.
> > > and even automatically send an email to root@pam ...
> 
> That's what serde already does, it errors if not valid JSON, which
> then errors the login (did not look at it, but would assume that an
> error there does not just defuse TFA completely...) ...
yes instead of erroring the login without any explanation we can do like
i suggested. that way we still get notified if something is wrong, and
without DOSing the whole server ;)
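
a rough sketch of what i mean by that check, in python purely for
illustration (pbs itself is rust/serde; the function name and the
return shape here are made up, not the actual code):

```python
import json

# Message taken from the suggestion earlier in this thread.
INVALID_MSG = "2FA configuration invalid, please contact administrator"


def load_tfa_config(raw: str):
    """Parse the TFA config text.

    Returns (config, None) on success or (None, message) when the
    content is not valid JSON or not a JSON object. Hypothetical
    helper, only meant to show the control flow.
    """
    try:
        config = json.loads(raw)
    except json.JSONDecodeError:
        # Fail closed: the login is still refused, but with an explanation.
        return None, INVALID_MSG
    if not isinstance(config, dict):
        return None, INVALID_MSG
    return config, None
```

the point is only that a parse failure still refuses the login (tfa is
not skipped), but the user sees why, and the same branch could trigger
the notification to the admin.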
> 
> > 
> >>
> >> I'd rather have it corrupt for all, as then the admin needs to
> >> fix it and we get notified, than some "magic" bug that only
> >> happens if it's a Tuesday and full moon.
> >>
> >> So no I do *not* want to have user.cfg, token.cfg, shadow.json with
> >> all info in one file, and then start to split TFA for every user,
> >> because of an error model which just assumes whatever one wishes.
> >>
> >>>>
> >>>>> 5. notify user if more than X failed tfa attempts (password is
> >>>>> already compromised at this point, so it's important to notify)
> >>>>> and block IP for certain amount of time (fail2ban?)
> >>>>
> >>>> we do not set up fail2ban, but any admin already can if they
> >>>> wish. Notification can only work if the user has set up a mail
> >>>> address in the first place - but yes, sou
> >>> yes, but imo 2fa is more sensitive to bruteforcing than regular
> >>> passwords so it would make sense to limit it by default
> >>
> >> why is it more sensitive? I need both, so it's the same? If I get
> >> leaked shadow and tfa, I need to break both, only one has no use -
> >> that's the idea of TFA...
> > 
> > it's more sensitive to bruteforcing; because of limited keyspace, as
> > in it's easier to bruteforce a 6 digit numerical passcode than a
> > regular passphrase in most circumstances. if attacker cracks/steals
> > a password and is presented with a 2fa screen, it should be unlikely
> > > for them to bypass/crack that. if i get unlimited tries to crack a
> > > 6-digit code i'll eventually get it right.
> 
> You have about 2 time windows to get through all combinations of 10^6
> with a forced response delay of 3 seconds + network latency, so 20
> tries max before the time changes so much that you need to start
> again...
> 
> > 
> > that's why i think attempts should be limited by default and not
> > reliant on fail2ban, because there's no use case where anyone tries
> > to enter a totp code a thousand times for any legitimate reason...
> > (however you could forget/lose your password easily so it's more
> > acceptable to let someone keep trying in the regular auth case)
> 
> but fail2ban can cope with the difference between >3 tries per minute,
> so why exactly 
> 
> >>>>
> >>>>>
> >>>>> 5.b also if recovery keys are available, limit amount of TOTP
> >>>>> attempts for that user
> >>>>
> >>>> what?
> >>>>
> >>>
> >>> if a user sets up TOTP + recovery keys, then it would make sense
> >>> to lock account in case of a lot of auth attempts with TOTP, until
> >>> recovery key is entered (afaik this is a common mechanism). but
> >>> maybe just notifying the user is enough as well.
> >>
> >> and why do you place more trust onto the fixed recovery keys than
> >> another TFA option? 
> > the same reason i explained above, this would only kick in when the
> > TOTP is disabled because of too many auth failures. if a user has
> > set up recovery keys then they can be already used instead of TOTP
> > (the option is there regardless). so it's not placing more trust on
> > the recovery keys.
> 
> It sure is, because you say that recovery keys are still good when u2f
> is not anymore, that implies you trust them more than u2f or other
> variants.

no... that's just how recovery keys work (they are usable at *any* time
until used once). the decision to set up recovery key or not is up to
the user. this lockout mechanism would be only with TOTP setup enabled +
recovery keys also enabled (in case it wasn't clear in the previous
mails).
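
to make the proposed behaviour concrete, here is a minimal sketch of the
lockout logic (python for illustration only; the class, the method names
and the threshold of 5 are assumptions, not anything from the patch
series):

```python
from dataclasses import dataclass, field

MAX_TOTP_FAILURES = 5  # the "X" from the proposal; the value is an assumption


@dataclass
class UserTfa:
    """Hypothetical per-user TFA state, just enough to show the flow."""

    recovery_keys: set = field(default_factory=set)
    totp_failures: int = 0

    @property
    def totp_locked(self) -> bool:
        # The lockout only applies when recovery keys exist; without
        # them the user would have no way back into the account.
        return bool(self.recovery_keys) and self.totp_failures >= MAX_TOTP_FAILURES

    def try_totp(self, code_ok: bool) -> bool:
        """code_ok stands in for the real TOTP verification result."""
        if self.totp_locked:
            return False  # even a correct code is refused while locked
        if code_ok:
            self.totp_failures = 0
            return True
        self.totp_failures += 1
        return False

    def try_recovery(self, key: str) -> bool:
        if key not in self.recovery_keys:
            return False
        self.recovery_keys.discard(key)  # recovery keys are single-use
        self.totp_failures = 0           # a valid recovery key re-enables TOTP
        return True
```

recovery keys are accepted in both states here, which matches what i
said above: they are usable at any time anyway, the lockout only takes
TOTP away after too many failures.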

> 
> > 
> > the flow could be something like this:
> > 1. user sets up 2fa, TOTP and recovery keys
> > 2. attacker logs in with stolen password
> > 3. attacker attempts to crack 2fa totp code
> > 4. fails after 3/5/X attempts, user gets notified and TOTP is disabled
> > 5. at this point user can only log in with password + recovery code
> >    (which they could anyway, even if TOTP is present)
> > 
> >> Which services/programs/websites do that, can you name a few examples?
> > 
> > afaik some "secure" email providers like protonmail/tutanota etc. use
> 
> ProtonMail seems to use just 8 hex characters as recovery key, and
> nowhere do I see any description of the behaviour you suggest..
> 
> https://protonmail.com/support/knowledge-base/two-factor-authentication/
my bad then
> 
> > this kind of mechanism (account password + mailbox is encrypted with
> > password, and recovery keys in case all else is lost/locked).
> > 
> > i'm sure there are other examples as well
> 
> I'm then sure you can list them, for now we're at 0 examples with actual
> source ;-)

here is one example: https://docs.genesys.com/Documentation/GAAP/latest/iaHelp/2FA
"If the result is successful (i.e. the code is valid), Intelligent
Automation grants access. If the result is unsuccessful (i.e. the code
is invalid), Intelligent Automation prompts the user to enter another
code, until the Maximum Attempts at Sending an Authentication Code value
is reached."

so it is definitely a thing...

here is some explanation for the mathematics behind it: https://security.stackexchange.com/questions/185905/maximum-tries-for-2fa-code
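
as a back-of-the-envelope version of that calculation (python, my own
simplification: every guess is treated as an independent try against a
uniformly random 6-digit code, and the function name is made up):

```python
def success_probability(attempts: int, valid_codes: int = 1, digits: int = 6) -> float:
    """Chance that at least one of `attempts` guesses hits a valid code.

    valid_codes is how many codes the server accepts at any moment
    (more than 1 if neighbouring time windows are accepted too).
    Assumes independent guesses, which is a simplification.
    """
    per_guess = valid_codes / 10 ** digits
    return 1 - (1 - per_guess) ** attempts


# One guess every 3 seconds (the forced response delay mentioned in this
# thread) gives 86400 / 3 = 28800 guesses per day.
per_day = success_probability(28800)
```

under these assumptions a single attacker guessing non-stop for a day
ends up at roughly a 2.8% chance, and that grows with every additional
day or parallel connection, which is why i'd like a hard limit.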

if lockout isn't preferred, another solution would be for example to
increase the delay in a linear fashion after every failed 2fa auth attempt
(gets longer to auth for that IP address each time TOTP code failed).
however this can also be easily bypassed by using proxies etc. during
bruteforce so i'd prefer a lockout policy instead.
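
for comparison, the linear delay variant could look roughly like this
(python sketch; the class name, the 3 second step and the in-memory
dict are assumptions for illustration):

```python
from collections import defaultdict

BASE_DELAY = 3.0  # seconds; the forced response delay mentioned in this thread
STEP = 3.0        # extra seconds per failed attempt; value is an assumption


class LinearBackoff:
    """Hypothetical per-IP delay tracker for failed 2fa attempts."""

    def __init__(self):
        self.failures = defaultdict(int)

    def delay_for(self, ip: str) -> float:
        # .get() avoids creating entries for addresses that never failed.
        return BASE_DELAY + STEP * self.failures.get(ip, 0)

    def record_failure(self, ip: str) -> float:
        self.failures[ip] += 1
        return self.delay_for(ip)

    def record_success(self, ip: str) -> None:
        self.failures.pop(ip, None)
```

note that a fresh IP address always starts at the base delay again, so
rotating through proxies resets the penalty. that is the bypass i mean,
and a per-account lockout does not have this problem.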

>